In this case, assume I’m the owner of an ecommerce website. I would like to create a Chatbot so that my users can ask specific questions about anything on this website (price, product, service, shipping, etc.) while they shop. The Chatbot will be supplied with this “private knowledge” and will ground its answers in the contents of the website.
Since I don’t actually own an ecommerce website, I’ll work around this by crawling content from an existing website on the Web. This is tricky because most websites prohibit scraping in their terms of use, and it could even be illegal to scrape major ecommerce websites such as Amazon, eBay, Alibaba, etc.
ChatGPT provided me with an ideal option: Books to Scrape (https://books.toscrape.com/), a simulated bookstore designed specifically for web scraping practice. It offers a simple structure for scraping book details such as title, price, and rating.
In this use case, I’ll assume I’m the owner of the Books to Scrape website and create the Chatbot based on it.
This might look a bit lengthy at first glance because it covers every step in detail, but once you have run through it once, you can get the same thing done within 5 minutes.
Step 1: Environment Setup
The tool we’re going to use sits on Google Vertex AI, so we’ll need a Google Cloud Platform (GCP) account.
Google has a free-tier program that provides new Google Cloud Platform (GCP) users with a 90-day trial period, including $300 in free Cloud Billing credits.
Follow the tutorial here to set up your free Google Cloud account.
Once you have set up your Google Cloud account and can access the console, create a storage bucket (step-by-step guide here) for use in the next step.
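If you prefer to create the bucket programmatically rather than through the console, here is a minimal sketch using the official google-cloud-storage Python client; the project ID, bucket name, and location below are placeholders for your own values:

```python
# Minimal sketch: create a Cloud Storage bucket with the Python client.
# "your-project-id", "my-chatbot-source-bucket" and the location are
# placeholders, not values from this tutorial.
from google.cloud import storage

client = storage.Client(project="your-project-id")
bucket = client.create_bucket("my-chatbot-source-bucket", location="us-central1")
print(f"Created bucket gs://{bucket.name} in {bucket.location}")
```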
Step 2: Prepare Private Knowledge and store it in Google Cloud Storage (low code)
As mentioned above, the private knowledge in this case is the content of the bookstore website.
For owners of ecommerce websites, all you need to do is provide the website URLs, and Google can automatically crawl website content from a list of domains you define.
Since I’m not an actual owner, I’ll work around this by scraping the site myself. Alan Blount from Google provided a very useful notebook to achieve this. All the code snippet does is scrape webpages from the website you specify and store them in a Google Cloud Storage bucket you specify, as sketched below.
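In essence, the notebook boils down to something like the following sketch (the variable names and single-page scope are my simplification; the actual notebook crawls a whole list of pages and also writes a metadata file):

```python
# Simplified sketch of what the notebook does: fetch a page and store the
# raw HTML in a Cloud Storage bucket. Variable names are illustrative.
import requests
from google.cloud import storage

website_url = "https://books.toscrape.com/"   # page to scrape
storage_bucket = "my-chatbot-source-bucket"   # bucket created in Step 1

# Fetch the raw HTML of the page.
html_string = requests.get(website_url, timeout=30).text

# Store it as an object in the bucket; uploads default to text/plain
# (see the content_type tuning note later in this step).
client = storage.Client()
blob = client.bucket(storage_bucket).blob("books.toscrape.com/index.html")
blob.upload_from_string(html_string)
print(f"Stored {blob.name} in gs://{storage_bucket}")
```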
That’s all you need to do:
2.1 Save a copy of the notebook in your own Drive
Recall that in Step 1 you created a brand-new Google account when you registered for Google Cloud? Your Google account comes with Google Drive, and you can save a copy of this notebook to your drive.
Select the “Save a copy in Drive” option from the “File” dropdown menu.
Then, if you go to Google Drive, you will be able to see the notebook you just saved. Feel free to rename it according to your needs.
2.2 In your own notebook, locate the following parameters and specify them (a sample metadata line follows this list):
- `website_url` refers to the website page URL you would like to scrape.
- `storage_bucket` refers to the Google Cloud Storage bucket you created in Step 1 above.
- `metadata_filename` refers to a JSON file that will be created and stored alongside the webpages. You may want to make it relevant to your website by changing `applied_ai_summit_flutter_search` to something that describes your use case.
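For reference, Vertex AI Search expects metadata for unstructured documents as JSON Lines, one document per line. The exact fields the notebook emits may differ slightly, but the documented shape looks roughly like this (all values below are illustrative):

```json
{"id": "doc-0001", "structData": {"title": "A Light in the Attic"}, "content": {"mimeType": "text/html", "uri": "gs://my-chatbot-source-bucket/books.toscrape.com/index.html"}}
```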
This is my version:
2.3 Run all
2.4 When it prompts you to authorize the Google Colab notebook to access your Google credentials, click “Allow” -> “Proceed”
Then the script should run through, showing the progress of the scraping at the bottom, just like this:
And if you go to your Google Cloud Storage bucket, you will see that the HTML files have been scraped and stored properly inside your bucket:
One thing to note is that the code snippet is not designed for every use case, so you might need to tune the code slightly to achieve your goal.
For example, in my case I tuned the code a bit by changing
`blob.upload_from_string(html_string)`
into
`blob.upload_from_string(html_string, content_type='text/html')`
By default the `html_string` would be uploaded as `text/plain`. Changing the content type to `text/html` lets the HTML content display properly at a later stage.
You can tune the code as much as you like.
Step 3: Create the Chatbot and the Data Store sitting behind it (no code)
Go to the Google Cloud Console (https://console.cloud.google.com/) and type “search and conversation” as the service:
Create “NEW APP”:
Select “Chat”:
Provide your “Company name” and “Agent name”. Note that the “Agent name” here will be the name of the Chatbot, so you may want to choose a name that makes sense to your users.
On the “Data” page, select “CREATE NEW DATA STORE”:
For owners of ecommerce websites, select “Website URLs” and provision your website URLs.
Since I have scraped the website contents into Cloud Storage, we can select “Cloud Storage” here:
Specify the Cloud Storage bucket name, and select “Unstructured documents” below:
Give your data store a name, then click “CREATE”.
You will see your data store listed; then click “CREATE”.
Your data store will be created, as below:
If you click into it, you will see that your data store is “processing data”, importing documents from the Cloud Storage bucket we specified earlier:
If we click the “ACTIVITY” tab, we can see that the import is in progress:
The import will take minutes to hours, depending on the number of documents in your Cloud Storage bucket.
In my case, I have over 1,000 files and it finished within minutes.
After the import is completed, the highlighted status changes:
And if you switch back to the “DOCUMENTS” tab, you will see the list of files imported into the data store:
That means you have all the materials and you are ready to cook!
Step 4: Test the Chatbot (no code)
In Step 3 above, we already created a Chatbot app as well as the data store sitting behind it.
Click “Apps” at the top:
You will see the Chatbot you created in the previous Step 3:
If you click the Chatbot’s name, you will be directed to the Dialogflow CX page, like below:
To test the Chatbot, select “Test Agent” in the upper right corner:
And the dialog box will pop up:
You can start the conversation by saying “hi” and then begin asking the Chatbot questions:
It works!
Step 5: Publish / Integrate your Chatbot (low code)
If you are happy with the Chatbot, it is easy to integrate it with your web application.
In the left pane, select “Manage” -> “Integrations” -> “Dialogflow Messenger”.
You can choose the type of API and the UI style according to your needs.
For demo purposes, I selected “Unauthenticated API” as the API and “Pop-out” as the UI style:
After selecting “Done”, a code snippet in HTML will be generated on the next page, as below:
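The generated snippet follows the standard Dialogflow Messenger embed and looks roughly like this; the `chat-title` and `agent-id` values below are placeholders to be replaced with your own:

```html
<!-- Standard Dialogflow Messenger embed; chat-title and agent-id are
     placeholders for the values shown in your own console. -->
<script src="https://www.gstatic.com/dialogflow-console/fast/messenger/bootstrap.js?v=1"></script>
<df-messenger
  intent="WELCOME"
  chat-title="Books to Scrape Bot"
  agent-id="YOUR-AGENT-ID"
  language-code="en"
></df-messenger>
```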
You can copy the code snippet and simply paste it into your application for integration.
For demo purposes, I copied and pasted this HTML snippet into JSFiddle and ran it, and got my little Chatbot working, as shown in the lower right corner!
Step 6 (Optional): Publish it through a Beautiful Application (low code)
If you don’t have an application yet and would like one, Google provides a good starting point through a public git repository: Chat App.
This is a Chatbot application written in Node.js, and you can easily adapt it for your own use by tweaking the code a bit inside `chat-app/src/routes/+page.svelte`.
You will need to change the `project-id`, `agent-id` and `chat-title` to your own values.
And once you run/deploy the app, you’ll get a web UI like this:
Of course, you can change the appearance of the UI as you like.
Now you have your own application!