Popular open-source AI platform, ChatGPT, is now offering GPT-4 to its users – no longer requiring access via a paid subscription.
In this article, we’ll discuss how ChatGPT can extract and summarise information from PDFs and what factors to take into account when using GPT-4 for enterprise and personal use.
GPT-4 offers superior reasoning abilities compared to its predecessor, GPT-3.5 – meaning that captured data is likely to be more accurate.
GPT-4 incorporates GPT-4Vision: a model for processing images, accessed via the GPT-4 API. Therefore, a key difference between GPT 3.5 and GPT-4 is that GPT-4 is a multimodal model, meaning that it can process multiple file types, including video, audio, image and text.
In terms of data capture, the user can now upload files, including PDFs and data files. GPT-4 can then respond to prompts based on your input.
One thing worth noting is that GPT-4’s only output is plain text, meaning its responses are difficult to export and edit.
This step may take a few seconds.
If you can't see an image button, ensure that you've switched over to GPT-4.
When entering the specified prompt to give ChatGPT-4, remember to follow the general principles of prompting Large Language Models (LLMs):
Try the prompt: ‘Extract the data from this image as plain text’. GPT-4 will then generate a widget with a displayed ‘copy text’ function.
Other variations of the prompt above that you can use to extract and then process data with ChatGPT.
Never skip quality-checking ChatGPT-4’s extracted information. LLMs trained for public use will inevitably contain minor bugs with potentially major consequences. Consider, for example, this viral Tweet [NSFW], which shows that the AI-generated response was not manually reviewed and modified.
There are numerous examples of ChatGPT failing to give well-reasoned responses. Responses will vary depending on the prompt and the context of the user’s interaction, meaning that two users could provide an identical prompt and, theoretically, receive different answers. There is no official method for maintaining consistent quality.
The process of summarising PDFs with GPT-4 is almost identical to extraction. All it takes is a willingness to validate the end result, a well-defined prompt and some deft copying-and-pasting. Simply upload the document and prompt it for a summary, remembering to define the parameters as closely as possible.
For example, you might request:
In this next section of the article, we'll discuss how to build extraction and summarisation from PDFs into an automated workflow with GPT-4 at its core.
Repeatedly uploading documents into GPT-4’s interface is inefficient and time-consuming, especially in the long run. Instead, for small batches of documents, you can use an online connectivity tool like Zapier to quickly establish an automated workflow between ChatGPT and a program such as Google Sheets. Though we use Zapier in this example, the process is similar for other iPaaS, such as Workato or SnapLogic.
Let’s cover how to automate summarisation and extraction from GPT-4, step-by-step:
*Note: Include a clear header in the sheet, so the programme knows where to deliver the summary.
In this example workflow, you would need to connect your Dropbox, ChatGPT and Google accounts.
Similar to the image above, configure the workflow end-to-end so that each document goes through: Dropbox (or any other mainstream storage platform your company uses) < ChatGPT < Google Sheets.
As an optional extra, you can add a final component to the workflow that will notify you via email or Slack when the program generates a summary.
When configuring ChatGPT, you will also need to generate an API key. Then simply add the API key where specified by the programme.
Input a clear prompt, such as:
After testing the trigger (generally, there will be a function on Zapier or the connective platform to do this), your summaries will appear in the allocated Google Sheets file like this:
For extraction from batch files – especially in parallel – consider a tailored data extraction solution to expedite the process.
ChatGPT’s 3.5 model struggles to attribute general information. For example, if you enter a prompt about something that doesn’t exist, the AI will often run with it. If you ask the model to source its response, it will create fictitious references.
An egregious example of GPT-4's failure to extract datapoints accurately is from BP's Remuneration Report 2022. We requested a summary of this page:
Though GPT-4 correctly transcribed some of the figures on this page, these were meshed with fabricated data points:
Here, GPT-4 makes a number of incorrect statements. For example, it states that Bernard Looney's cash in lieu of retirement benefits amount to£230,000 (rather than £206,000). It also states that Murray Auchincloss's benefits amount to £118,000 (rather than £88,000), and his cash in lieu of retirement benefits are £107,000, rather than £117,000.
We then replicated the prompts in another browser - and GPT-4 generated different data. For example, note here that it claimed that Auchincloss's benefits amounted to £117,000, rather than £88,000 (what is correct) and £118,000 (what it stated previously).
Inconsistency between users is an issue that makes GPT-4 an unwieldy tool for enterprise use. Though GPT-4 offers ‘broader general knowledge and problem solving abilities’, it is still not a reliable or coherent source of information.
Implications for enterprises
For enterprises, misattribution is especially problematic. For instance, if you discover an anomalous data point, it’s crucial to work out whether it’s due to an extraction error or if the data point was already present in the document. The easiest way to do this is to click on the data point and source it in the uploaded document.
Working with a data extraction solution with high confidence scores is helpful when addressing potential issues with attribution. A confidence score indicates the probability that the algorithms’ output is correct.
Overall, GPT-4 may not be a suitable tool for all use cases. We tested it ourselves and discovered that it occasionally failed to extract from financial statements correctly. For the balance sheet below, GPT-4 failed to recognise nuclear fuel and construction in progress as long-term assets, confidently - and incorrectly - stating that the net PP&E (Plant, Property & Equipment) values were '$78,119 million for 2020 and $74,349 million for 2019'. Therefore, the algorithms displayed poor judgement, creating a false statistic.
For businesses, errors like this create misleading impressions with potentially catastrophic consequences.
***
Moreover, GPT-4's muddled calculations in this example fall below the capabilities of a human accountant, making it an unreliable tool. Eagle-eyed users may note that GPT-4 now has a warning at the bottom of its interface: 'ChatGPT can make mistakes. Consider checking important information'. Well said.
Editing extracted data
While you can always correct an error in captured data, it’s worth reviewing the simplicity of the editorial process, depending on the data extraction tool you use.
For example, if ChatGPT makes an error when extracting data, you’ll need to correct the output or request an altered version manually. In contrast, some bespoke data extraction services offer a user-friendly interface for validating and verifying ambiguous datapoints.
In sum, to maximise accountability and prevent the recurrence of errors, ensure that all data captured is attributable.
Although ChatGPT is available as an API, it requires specialist coding to embed it into your company’s IT infrastructure. For certain companies, integration via API may be a cumbersome or inappropriate option, compared to lightweight methods such as file transfer via SFTP or through no-code platforms like Workato.
Running data through multiple sources contravenes a number of data security regulations. For industries relying on high information security – healthcare, financial services, etc – ChatGPT is a risky option.
Although it’s possible to extract and summarise data from PDFs using ChatGPT, it isn’t an enterprise-ready solution.
For secure, seamless and specialised data extraction from PDFs, try our platform – Transcribe – where you can upload documents and download structured data for free.
Ready to extract from PDFs with ease? Book a demo with one of our experts or send us an email at hello@evolution.ai today.