
Caspio + AI: Simple Guide to Document Understanding


andRe

Question

I've noticed that the topic of using AI in Caspio apps hasn't gotten much attention on this forum. This is surprising, given the big advancements in AI over the past few years. 

Recently we discussed this topic during the Partners’ Office Hours with @nikcaspio and decided to test the waters by sharing some ideas on the forum and hearing what the community thinks. I'd gladly give more examples in future posts if you find these thoughts helpful. The topic is almost endless, with so many ways to use AI.

But let's start simple:

WHAT IS DOCUMENT UNDERSTANDING

We all know how Caspio can help create outgoing PDF documents based on database records. But what about incoming documents like invoices, receipts, delivery notes, or bills of lading? These often come as files or even paper documents. Here, we have to do the opposite of document creation. We need to take the document and turn it into data. Usually, someone has to enter this data manually into Caspio, right?

Well, that was the old way :) Things have changed.

To show what's possible now, I made a short video using our Caspio app, where I upload a JPEG file with an invoice, and the system extracts the necessary data from it and puts it into Caspio:

https://www.loom.com/share/d21eb53d9b5a4540a6282a56d8d47f0e

Is it magic? No, it's simply artificial intelligence. And a pretty inexpensive one. I paid only a few cents to our "digital elves" for assisting me in recognising these documents in the video. The cost of human labor would most likely be much higher.

TECH STACK

Of course, you first need a Caspio plan with REST API and Zapier Integration. The Professional plan includes both, and you can also buy them as add-ons to smaller plans.

In this example, we used Google Document AI's Invoice Processor to do the work. It has some limits but is a good place to start and see quick results. And you can overcome these limits with more advanced methods. However, for this demo, it's more than enough.

At first, I considered using Zapier for this demo because it was easy. But I know many people here find Zapier too expensive. So, we chose a more budget-friendly option. We used the newly introduced Caspio Webhooks and Make (Integromat) as middleware. This is just an example; you can use any technology you like.


MAIN COMPONENTS:

To make Document Understanding work, you need three main parts:

  1. OCR (Optical Character Recognition): This tech changes a binary file (like a photo taken with your phone) into text.
  2. Text Extractor: This tool looks for specific info (like invoice numbers or dates) and gives it back in an organized way.
  3. Middleware: This is the connector between Caspio and AI services. It takes a file from Caspio, sends it for processing, returns the data, and puts it into a Caspio table.


GOOGLE DOCUMENT AI

Google Document AI can do tasks #1 and #2.

You can learn more about this service here, and here's how to set it up.

Of course, Google Document AI is not the only option, but it's an excellent place to start. Once you know its limits, you'll know how to work around them. There are many ways to do this. For example, in most of our projects, we use Google Document AI mainly for OCR. Then, we process the text with a tool from OpenAI, the creators of ChatGPT. But that's more advanced, and I don't want to make things too complicated in this article.


MIDDLEWARE

Google Document AI comes with a robust and well-explained API. But it needs incoming calls to have a particular format and gives back answers in its own format.
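To give a feel for the answer format: as far as I know, the process call returns JSON with a `document.entities` list, where each item has a `type`, a `mentionText`, and a `confidence`. Here's a minimal Python sketch that flattens such an answer into a plain dictionary. The sample response is made up for illustration, and the entity type names (`invoice_id`, `total_amount`) are what I'd expect from the Invoice Processor, so please check them against your own processor's output.

```python
def flatten_entities(response, min_confidence=0.5):
    """Turn a Document AI-style response into {entity_type: text}.

    Keeps the highest-confidence mention for each entity type and skips
    anything below min_confidence.
    """
    best = {}
    for entity in response.get("document", {}).get("entities", []):
        etype = entity.get("type")
        text = entity.get("mentionText", "")
        conf = entity.get("confidence", 0.0)
        if not etype or conf < min_confidence:
            continue
        if etype not in best or conf > best[etype][1]:
            best[etype] = (text, conf)
    return {etype: text for etype, (text, _) in best.items()}


# Made-up sample shaped like a Document AI response (not real output).
sample_response = {
    "document": {
        "entities": [
            {"type": "invoice_id", "mentionText": "INV-001", "confidence": 0.98},
            {"type": "total_amount", "mentionText": "120.50", "confidence": 0.91},
            {"type": "total_amount", "mentionText": "12.50", "confidence": 0.40},
        ]
    }
}

fields = flatten_entities(sample_response)
```

The confidence threshold is my own choice for the sketch. In a real app you might prefer to store low-confidence values too and flag them for a human to review.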

If you're already using Zapier with Caspio, you might find it the easiest option, and you probably know how to use it. If so, skip to the Make Scenario section below for extra ideas. But if you're not using Zapier, stay with me. This post will focus on Caspio Webhooks as a more budget-friendly choice.

Webhooks are good, but they have a limit: they're set in a standard way and don't allow much change. So, you'll need another tool to take the webhook call from Caspio and change it into a format that Document AI understands.

There are many choices for this, and it's a big topic that could have its own set of detailed articles. For this demo, we'll use Make (Integromat). We find it even better than Zapier, and cheaper too. They also have an excellent free plan that might be enough for smaller projects. Finally, Caspio has built a connector for Make, making it easy to set up, even if you're not an API expert.


CASPIO CONFIGURATION

To keep it easy, we turned on the FileStor option. This lets us get the file from Caspio using a direct link. I know this isn't the most secure way to handle files, but that's not the main point of this article, so I hope you understand.

We also made a table and a few data pages in our demo app. These let you create a record and upload a file. Next, we set up a Caspio webhook that starts when a new record is added (Insert event). This webhook sends the call to Make's incoming webhook. Finally, we made a Web service profile so Make can talk to Caspio using REST API and return the results to the table.

All these steps are well-covered in the Caspio manual, and there are also good video tutorials. So, I won't go into the details here. (If you need those, just ask in the comments, and I can post the links.)
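For those who prefer code to connectors, here is a rough Python sketch of the call that writes the results back to the table through the Web service profile. It only builds the request and doesn't send anything. Please treat the details as assumptions: the `/rest/v2/tables/.../records` path and the `q.where` parameter follow my reading of the Caspio REST API docs, and the account URL, the `Invoices` table, the `PK_ID` key field, and the field names are hypothetical. Check them against the Swagger page of your own profile.

```python
import json
from urllib.parse import quote, urlencode


def build_caspio_update(account_url, table, record_id, fields, access_token,
                        key_field="PK_ID"):
    """Return (method, url, headers, body) for updating one Caspio record.

    Assumed endpoint shape: PUT {account_url}/rest/v2/tables/{table}/records
    with a q.where filter that selects the record to update.
    """
    where = f"{key_field}={int(record_id)}"
    url = (
        f"{account_url.rstrip('/')}/rest/v2/tables/{quote(table)}/records?"
        + urlencode({"q.where": where})
    )
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    return "PUT", url, headers, json.dumps(fields)


# Hypothetical account URL, table, and field names, for illustration only.
method, url, headers, body = build_caspio_update(
    "https://example.caspio.com/",
    "Invoices",
    7,
    {"Invoice_Number": "INV-001", "Total": "120.50"},
    "PLACEHOLDER_TOKEN",
)
```

The access token comes from the client ID and secret of the Web service profile. In Make, the Caspio connector takes care of that part for you.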


SCENARIO IN MAKE

[Screenshot: the scenario in Make]

The process has just six steps in order:

  1. An incoming webhook that waits for calls from Caspio.
  2. A 10-second pause to give Caspio time to send the file to FileStor.
  3. A simple GET request to get the file from FileStor.
  4. Some minor changes to the data to get it ready for Document AI (details below).
  5. A call to Document AI to send the prepared data.
  6. A standard Caspio connector that puts the data back into the Caspio table.
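If you think in code rather than in Make modules, the same six steps can be sketched in Python like this. It's only an outline of the flow, not what Make runs internally. The payload keys `file_url` and `record_id` are hypothetical (your webhook payload will have its own field names), and the four helper functions are passed in so the flow can be tried with stubs before connecting real services.

```python
import time


def run_scenario(webhook_payload, fetch_file, prepare, call_document_ai,
                 save_to_caspio, wait_seconds=10, sleep=time.sleep):
    """Run steps 2-6 for one incoming webhook call (the call itself is step 1)."""
    sleep(wait_seconds)                                    # step 2: let the upload finish
    file_bytes = fetch_file(webhook_payload["file_url"])   # step 3: GET the file
    request = prepare(file_bytes)                          # step 4: headers + base64
    extracted = call_document_ai(request)                  # step 5: call Document AI
    return save_to_caspio(webhook_payload["record_id"], extracted)  # step 6


# Dry run with stubs instead of real services.
pauses = []
result = run_scenario(
    {"record_id": 7, "file_url": "https://example.com/invoice.jpg"},
    fetch_file=lambda url: b"fake-bytes",
    prepare=lambda data: {"size": len(data)},
    call_document_ai=lambda request: {"invoice_id": "INV-001"},
    save_to_caspio=lambda record_id, fields: (record_id, fields),
    sleep=pauses.append,  # record the pause instead of actually waiting
)
```

A fixed pause is the simplest option. A sturdier variant would retry the download a few times until the file is actually there.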

If you're new to Make or APIs and find all of this confusing, let me know if you'd like more details. I intended this post to be an outline rather than a step-by-step guide. However, I can explain things further if needed. It's not rocket science.

But if you're familiar with Make or at least with REST APIs, most of these steps should be pretty straightforward. The only part that might need some extra attention is step #4, where we set up the headers and convert the binary file to base64. Here's how we set it up:

[Screenshot: settings for step #4 in Make]
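In code form, the preparation in step #4 looks roughly like the Python sketch below. It isn't an export of our Make module, just my understanding of what the Document AI `process` endpoint expects: a JSON body with the file as base64 text under `rawDocument.content`, plus its MIME type, and a Bearer token in the headers. The project, location, and processor IDs are placeholders.

```python
import base64
import json


def build_process_request(file_bytes, mime_type, access_token,
                          project_id, location, processor_id):
    """Return (url, headers, body) for a Document AI process call."""
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:process"
    )
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json; charset=utf-8",
    }
    body = json.dumps({
        "rawDocument": {
            # JSON can't carry raw bytes, so the file travels as base64 text.
            "content": base64.b64encode(file_bytes).decode("ascii"),
            "mimeType": mime_type,
        }
    })
    return url, headers, body


# Placeholder values, for illustration only.
url, headers, body = build_process_request(
    b"\xff\xd8\xff fake jpeg bytes", "image/jpeg", "PLACEHOLDER_TOKEN",
    "my-project", "us", "my-processor-id",
)
```

Nothing is sent here. The three returned values are what you'd hand to an HTTP client, or type into Make's HTTP module.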

If you've done everything right, you should be good to go!

Not too hard, right?


MORE ON THE TOPIC

In this post, we've just touched the tip of the iceberg on Document Understanding. It's a big topic that could fill many more articles and take hundreds of hours to make a really deep dive. This is especially true for understanding complex documents like contracts or unstructured text documents. And the real magic happens when you move past ready-made solutions and train AI models with your own data. Learning all this can be a fun adventure, and the possibilities are endless.

However, even with simple setups, AI can be a game-changer for many businesses. It can save time on dull tasks and let your team focus on more important work instead of just copying data from paper to an app.

But AI isn't just about Document Understanding; it has lots more to offer. Here are a few more examples:

Semantic Search: This isn't just looking up keywords; it understands the meaning of your question, irrespective of the words you choose. This can be useful for Knowledge Management and Customer Support apps.

Voice-to-Text: Turn your video calls into text, summarize what was said, and add it to your Caspio app as follow-up notes. This is good for CRM, Project Management, Knowledge Management, and Recruitment apps.

Classification of Incoming Requests: Automatically sort new requests and applications based on their content. This can help with CRM, Customer Support, and Recruiting apps.

And there's much more.


FINAL REMARKS

I must say, this article ended up being longer than I first thought it would be. So, big thanks to you for sticking with me till the end! :D

If anything was unclear or you ran into issues, please feel free to leave a comment or write me a DM if you prefer. I'll do my best to help you out.

Now, I'd really like to hear your thoughts. Did I explain it well, or did it just confuse you? Does it make sense for you?  Are you interested in this topic at all? Do you want to see more articles like this one? If so, what would you like me to focus on? More technical details? Real-world examples? A different writing style?

I welcome your feedback and constructive criticism. 

Thank you!


8 answers to this question


  • 1

Did I explain it well, or did it just confuse you? 

         You explained this very well, in a way that I imagine gets a lot of us excited about the potential of AI and how it can already help us out. Your article is very easy to follow, and the steps you laid out are great!

Does it make sense for you?  

         100%! I think even novice developers would be able to follow this article with little confusion.

Are you interested in this topic at all?

         Very interested, both in the different ways AI can be used to help developers and users with Caspio. This Document Understanding solution is one that I believe many of us will appreciate.

Do you want to see more articles like this one?

         Yes, please! 

If so, what would you like me to focus on?

         More ways AI can be integrated into Caspio. I am under the belief that this is the next big change for cloud computing and "low-code" solutions.

More technical details?

         I think what you have included here in this article is great - an overview and quick tutorial on how to set this up. If more technical details are warranted, there is always the comments section or DMs!

Real-world examples?

         A few scenarios that I can think of that would greatly increase our experiences on Caspio, but might be a bit farfetched at the moment: 

  • Database Query Assistance, where users can input search terms such as "SUV" and have an output of values such as "Audi Q3, BMW X5, Chevrolet Tahoe" for rental car databases (thanks, @BenjaminS!) or perhaps "Kawneer 451" to spit out different glazing systems that utilize that style of storefront. 
  • User Training, where AI can scan through materials and generate training materials and methods to help kickstart users on how to use an application, or to enhance the onboarding process of new employees.
  • Custom Application and Data Modeling/Schema Design - Share the guidelines for your next application and have AI give you a head start by creating the tables and datapages needed to get you started on your development.

A different writing style?

         The way the article is laid out is great, in my opinion.

 

Lastly, THANK YOU for getting the ball rolling on topics like this. I think there are a lot of people here like myself who have been waiting for advancements in AI and how it can help us and our users, but weren't sure how to get started or do anything like this. Your Document Understanding solution opens my eyes to different ways we can incorporate AI, even if it isn't a solution directly from Caspio itself! It has been a while since I last felt some excitement about the Caspio sphere and the future ahead.


  • 0

This is a really great article--I use Make.com for some record updating but haven't yet used Webhooks or AI integration. I find Make.com really confusing, though, compared to Zapier (which is ridiculously expensive). I'll give this angle a shot--thanks for posting!


  • 0

Hello! Caspio has released another enhancement to the Webhooks feature. See below:

Caspio Webhooks allows your applications to communicate with third-party systems in real time when changes to data occur. 
In this release, we introduced the following updates:

  • In non-HIPAA/Compliance accounts, webhooks logs now show the information sent and received.
  • All accounts, including HIPAA/Compliance, now can:
    • Send webhooks based on table events initiated by triggered actions and tasks.
    • Configure webhooks for tables with triggered actions enabled for the same event as the webhook. 
  • A single REST API call can be used to activate or deactivate all events for a selected webhook.

Source: https://howto.caspio.com/release-notes/caspio-47-0/#:~:text=Enhancements to Caspio Webhooks

