From the course: Automating Your Work with Custom GPTs (No Code Required)
Extract text from images - ChatGPT Tutorial
- [Instructor] Let's build a GPT that can pull text from an image. This can come in handy in all kinds of situations. Maybe you have a handwritten note that you want to convert into editable text, or maybe you need to repurpose some text from a printed document that you took a photo of. We'll start again by going to Explore GPTs, and we'll come up and click Create to create a new GPT. And here under the Create tab, let's say we want to make an assistant to extract text from an image and convert it to plain text that can be easily copied and pasted elsewhere. The GPT builder offers to name it Plain Text Extractor. I'll just say, "That's fine." Next, it'll generate an image for our GPT. And again, I'll just accept that. But again, you can ask it to refine the image or redo it if you like.
So now it's asking me if I want to give this GPT a test run, but in this case, I do want to refine the assistant a bit. I should probably specify what kind of images we'll primarily be using. Maybe I also want it to explain to the user if it has problems. For example, if it has difficulty extracting text from a low-resolution image or images with complex backgrounds, it should let the user know. So let's tell it, "This assistant should work for photos, screenshots, handwritten notes, or any other image where text is discernible. Alert the user if there are issues with the image that make it difficult to extract the text and offer suggestions on how to improve the results." So it will again take that information and refine our GPT. And maybe lastly, I'll just add, "Keep explanations brief, but offer to provide detailed guidance if requested by the user."
All right, now before we test it, let's go over to the Configure area again. And here we can see the details of what we created. We can again change the name or description here if we want. And always be sure to read through the instructions and make any changes or refinements that you think might be necessary.
If you're not sure what to write in here, you can always go back to the Create tab and make your changes conversationally instead. Again, we can edit, add, or delete conversation starters. Maybe I'll just leave two to keep things simple. And we don't need to upload any files that the GPT needs to know about ahead of time in this case. I probably also won't need web searching, Canvas, or DALL-E image generation. I'll just leave them checked because we don't always have to uncheck these things if we don't need them.
But let's come over here to the Preview area and try it out. I'll click the + button to upload a file I have on my desktop. So you can see this is a screenshot of a hotel room. We have the text over here on the left. You can see it's formatted in different sizes. We have some bold text, some italic text, and so on. Let's see how ChatGPT does with this. With it selected, I'll click Open and we'll send that. And I don't even need to type in a prompt because this is a custom GPT where I've already created instructions on what I want it to do. You can see it successfully pulled the text out of the image without any line breaks, making it very easy to copy and then paste into another application. And ChatGPT is actually very good at discerning text. As long as you have a decently clear image, it should be able to do a pretty good job with scanned, photographed, or handwritten text.
All right, so I'll go ahead and click Create and save this GPT for myself. And we can view the GPT. So if you ever find yourself in situations where you need to convert images of text into actual text you can edit or copy and paste, you'll probably get a lot of good use out of a text extraction GPT.
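None of this requires code, but if you like to keep your GPT instructions in a file so you can reuse or tweak them later, the purpose and the refinements dictated in this walkthrough could be collected into one block of text to paste into the Instructions field on the Configure tab. Here is a minimal sketch in Python. The names `PURPOSE`, `REFINEMENTS`, and `build_instructions` are made up for illustration, the bulleted layout is just one possible arrangement, and the script does not connect to ChatGPT in any way.

```python
# Hypothetical helper: nothing here talks to ChatGPT. It only assembles text
# that you would paste by hand into the Instructions field on the Configure tab.

# The purpose statement given to the GPT builder in the walkthrough.
PURPOSE = (
    "Extract text from an image and convert it to plain text that can be "
    "easily copied and pasted elsewhere."
)

# The refinements dictated in the walkthrough, quoted as spoken.
REFINEMENTS = [
    "This assistant should work for photos, screenshots, handwritten notes, "
    "or any other image where text is discernible.",
    "Alert the user if there are issues with the image that make it difficult "
    "to extract the text and offer suggestions on how to improve the results.",
    "Keep explanations brief, but offer to provide detailed guidance if "
    "requested by the user.",
]


def build_instructions(purpose: str, refinements: list[str]) -> str:
    """Return the purpose on the first line, then one bullet per refinement."""
    lines = [purpose] + [f"- {item}" for item in refinements]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_instructions(PURPOSE, REFINEMENTS))
```

Running the script prints the assembled text, which you can then copy into the Instructions field and adjust by hand, just as you would after reading through what the GPT builder generated.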