Skip to content

dartmouth-cs52-18S/workshop-ws-5-1-apis-google-cloud-vision-api

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CS52 Workshops:  Google Cloud Vision API ☁️

In our presentation we talked about using different APIs like IBM'S Watson, Microsoft's Azure, and Google's Cloud Vision, to quickly add Artificial Intelligence to our applications. We want to introduce you to the Google Cloud API, which will allow you to get really cool, invaluable insight from images.

Overview

The Google Cloud Vision API allows us to understand the contents of an image. You can use it for label detection, logo detection, explicit content detection, face detection, image attributes, web detection, and more. But today, we are going to use the Google Cloud Vision API to identify an uploaded image using label detection (tagging an image with attributes and text recognition (OCR).

Setup

Claim your Google Cloud coupon:

Click here to obtain your free Google Cloud coupon (thanks Tim!). Google Cloud requires a billing account with a valid credit card, but with this coupon, we'll bypass that entirely.

Enter your Dartmouth email and name. Then go to your email to verify your account.

After verifying your email, you should receive another email with your coupon code. Click the link they provided. You might have to sign in to your Google account if you aren’t signed in already.

Create New Project

Now, create a new project in your google account by clicking here.

Enable billing for project 💸

Go to this link to see your billing accounts.

If you only have one billing account, move on to the ENABLE API section.

If you have multiple billing accounts, do the following to set up the project correctly:

  1. Go to the Google Cloud Platform Console and open the console left side menu and select Billing.

  2. If you have more than one billing account, you'll be prompted to select Go to linked billing account to manage the current project's billing.

  3. Under Projects linked to this billing account, locate the name of the project that you want to change billing for, and then click the menu next to it (3 dots).

Our project is called demo, but feel free to call your project anything you want! the page should look like this

Select Change billing account, then choose the desired destination for the billing account and click Set account.

Enable API:

Click here to enable the API for your project. The page should look like this:

Click select a project at the top and choose your project name. Then click Open.

You should be at a page that looks like this.

If you don't have the sidebar on the left, click the three horizontal menu bars on the top left.

From this sidebar, click APIs & Services and then click Dashboard.

You should be taken to a page that looks like this.

Scroll to the bottom that has a list of APIs and click Cloud Vision API.

If Cloud Vision is not enabled, you should be taken to a page that looks like this.

Click Enable at the top.

If Cloud Vision is already enabled, you would just see a disable option. So do nothing.

Authentication

If you haven't already, fork this repo to get our starter code.

To use this client library, you must first authenticate. Complete these steps:

Go to the Create service account key page in the GCP Console: Service account key page

Make sure your project name is showing at the top and it is the right project (ours is called demo).

  • if it doesn't work select Credentials from the left sidebar under APIs & Services
  • select the Create Credentials dropdown menu
  • click Service Account Key

From the Service account drop-down list, select New service account.

Enter a name into the Service account name field.

From the Role drop-down list, select Project > Owner.

Click Create. A JSON file that contains your key will download to your computer. Save the file in your forked repo root directory as key.json.

Initial Environment Set-up/Checks:

Check then npm version you have installed currently using node -v If you don't have version 9.8.0 you need to rollback your version. If you don't, the demo won't work. Run the following in your terminal:

npm install --global [email protected]

Step by Step 🔢

cd into the root directory of the repo and run npm install to install all of the package dependencies. Don't run yarn. Running yarn will overwrite some of the packages needed for this demo to work.

‼️ Make sure that you're running version 9.8.0 of node. Otherwise, running npm install will cause a lot of nasty errors

This is going to be a mini React App that takes in an image and returns a label describing what is in the image (image recognition). The React app runs an Express server to send POST requests to the Google API server. As a reminder, a POST request requests that a web server accepts the data enclosed in the body of the request message in order to do something with that data. In our case, this data is an image. For example, say we upload the following image:

The Google API server will return a JSON file that looks something like this:

{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/0bt9lr",
          "description": "dog",
          "score": 0.97346616
        },
        {
          "mid": "/m/09686",
          "description": "vertebrate",
          "score": 0.85700572
        },
        {
          "mid": "/m/01pm38",
          "description": "clumber spaniel",
          "score": 0.84881884
        },
        {
          "mid": "/m/04rky",
          "description": "mammal",
          "score": 0.847575
        },
        {
          "mid": "/m/02wbgd",
          "description": "english cocker spaniel",
          "score": 0.75829375
        }
      ]
    }
  ]
}

containing the identification tags for the image sent.

But wait, what's Express?! 😱 Sneak peek to the short assignment coming out today: Express is a server-side JavaScript framework that adds another dimension to the layers of web development that we've already learned. It’s a lightweight framework that gives developers extra, built-in web application features.

In our case, the Express server is used to facilitate the client-server interaction between our app and the Google Cloud Vision API.

Without further ado, let's get started!

  1. Don't forget to run npm install! In the src directory, create a file titled index.js and copy and paste the following code into the file:
import React from 'react';
import ReactDOM from 'react-dom';
import './index.css';
import App from './App';

ReactDOM.render(<App />, document.getElementById('root'));

You should be a React expert by now, so you'll know that this file takes in an App component and renders the HTML element with the #root id with that component. We could put all of the React code into this file and the app would run just fine; however, as we've learned, it's better to include the App component in a separate file and render it from there because it keeps the app more organized.

  • have .eslintrc errors? try running npm i eslint-config-airbnb
  1. Also in the src directory, create an App.js file and copy and paste the following code into the file:
import React, { Component } from "react";
import "./App.css";

class App extends Component {
  constructor(props) {
    super(props);
    this.state = {
      image: "",
      imageT:"",
      identification: "",
      identificationT:""
    };
  }

  render() {
    return (
      <div>
        Google Visiony things will be happening here soon
      </div>
    );
  }
}

export default App;

No further explanation needed here – at this point, you should be able to create a new React component in your sleep 😴. Now, open your terminal and run npm start. A new tab should open in your browser with the app. Ooooh, magic 💥

(jk, it's actually an npm package called react-scripts)

  1. Now, make a new directory called server in the root directory and in it, a new file called server.js. This is where our mini-Express server will go. Copy and paste the following code into the file:
const express = require("express");
const app = express();
const bodyParser = require("body-parser");
const multer = require("multer");
const path = require("path");
const storage = multer.memoryStorage();
const upload = multer({ storage: storage });
const vision = require("@google-cloud/vision");
const client = new vision.ImageAnnotatorClient({
  keyFilename: '../key.json',
  project_id: "YOUR-PROJECT-ID"
});

app.use(bodyParser.json());

app.post("/upload", upload.single("myImage"), function(req, res) {
  client

    .webDetection(req.file.buffer)
    .then(results => {
        console.log(results[0].textAnnotations);
      const webDetection = results[0].webDetection;
      var identification = { breed: webDetection.webEntities[0].description };
      console.log(webDetection.webEntities[0].description);
      res.json(webDetection.webEntities[0].description);
    })
    .catch(err => {
      console.error("ERROR:", err);
    });
});


app.post("/uploadText", upload.single("myImage"), function(req, res) {
  client
    .textDetection(req.file.buffer)
    .then(results => {
      console.log(results);
      const textDetection = results[0].fullTextAnnotation;
      var identificationT = { breed: textDetection.text};
      console.log(textDetection.text);
      res.json(textDetection.text);
    })
    .catch(err => {
      console.error("ERROR:", err);
    });
});

app.listen(3001);

Woah, that's a lot of code! Make sure you have the 🔑 key.json file in your root directory and that you replace YOUR-PROJECT-ID with the project id of the project you created earlier.You can find it in your key.json file under the project_id field.

Now, let's understand what this code is doing.

Each function here creates an Express server that handles all of the POST requests to the Google Vision API via the route /upload and /uploadText. The POST request is uploading something called myImage, which will contain the data of the images you'll be uploading to the app later. The request returns a response JSON that is parsed and any errors are handled when necessary. To start this server, we'll run node server.js later, and the server will listen at port 3001 for any changes. There are two POST functions to handle the requests for two different features: label detection and OCR.

  1. Now, go back to your App.js file and replace the code in the return statement of your render function with this:
    <div>
        <header>
          <h1>Google Cloud Vision Image Identifier</h1>
        </header>
        <div className="uploadForm">
          <h3>Upload Image</h3>
          <input
            type="file"
          />
          <button className="button" onClick={this.postImg}>Click to identify</button>
        </div>

        <h3 style={{ marginTop: 50 }}>{this.state.identification}</h3>
      <div className="uploadForm">
        <h3 style={{ marginTop: 50 }}> Upload text Image</h3>
        <input
          type="file"
        />
        <button className="button" onClick={this.postImgText}>Click to identify</button>
      </div>

      <h3 style={{ marginTop: 50, marginBottom: 20 }}>{this.state.identificationT}</h3>
    </div>

These are just some basic HTML components to prepare to allow you to upload photos. Nothing too complicated. You can refresh your browser tab to see the changes if they're not getting displayed.

  1. Above the render function, add the following functions:
  postImg = () => {
    var formData = new FormData();
    formData.append("myImage", this.state.image);
    fetch("/upload", {
      method: "POST",
      body: formData
    })
      .then(response => {
          if(response.status === 200) {return response.json();}
          else {
            console.log(response);
            return { error: 'there was an error with response' }
          }
      }).then(response => {
        if(response.error) { console.log(response); }
        else {

          this.setState({ identification: response });
        }
      });
  };

  postImgText = () => {
    var formData = new FormData();
    formData.append("myImage", this.state.imageT);
    fetch("/uploadText", {
      method: "POST",
      body: formData
    })
      .then(response => {
          if(response.status === 200) {return response.json();}
          else {
            console.log(response);
            return { error: 'there was an error with response' }
          }
      }).then(response => {
        if(response.error) { console.log(response); }
        else {

          this.setState({ identificationT: response });
        }
      });
  };

Uh oh, more HTTP requests 😰. This function actually isn't that complicated when you take another look at it. Can you guess what it does? postImg and postImgText would be presumably called on some action and they take in the current state of the uploaded image, convert it to an HTTP request-friendly format, and send it via the response. The functions then handle the response, either updating the state identification or console logging the error.

  1. Now lets go back to the render function. You'll notice that the calls to change the state of image and imageT aren't defined. Can you figure out how to add them in?
  • Hint 1: when should the state change be called?

  • Hint 2: set the state of image and imageT to e.target.files[0].

If you can't figure it out, scroll to the bottom of the README for the answer 😅.

Great, now you should have all the code you'll need to run the app!

Putting it all together

Run npm start again in the root directory of your repo or refresh your browser tag. This is what you should see:

Now, open a second tab in your terminal and cd into the server directory. Once there, run node server.js. This will run the server concurrently with your web app. Remember, the server handles all of your requests to the Google Vision API and needs to run in order for your images to be processed correctly.

Navigate back to the browser tab and click on the first button. Upload any image (if you need an image, feel free to use the image of the dog we included earlier – it's in the Rimg directory in the repo). Then, click Click to identify. Voilà, after a few seconds, a tag will appear below the input fields, effectively identifying your image! This is the output you should see in your terminal running the Express server:

The words that show up are the labels the Google Vision API is assigning to your image. Play around with different images and the other button that handles images with text (OCR) to get a feel for the API.

Extra Credit ⭐

We've already added two buttons for you: Label Detection and OCR. Add another button for face recognition and get it to work for additional bonus points! (For hints, look at the server.js code again and check out the Google Vision API source code for reference).

Summary / What you Learned

  • React recap
  • Implementing Google Cloud Vision API
  • Intro to APIs
  • Intro to Express

What to turn in

  • The URL to your forked repo
  • A screen shot of the face recognition button working for you if you attempted the extra credit

Sources

  • This tutorial was based off of this repo created by brianhallerweb on GitHub.

Resources

Answer to state change question

<input type="file" onChange={e => this.setState({ image: e.target.files[0] })} />

<input type="file" onChange={e => this.setState({ imageT: e.target.files[0] })} />

About

workshop-ws-5-1-apis-google-cloud-vision-api created by GitHub Classroom

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •