[Vertex AI] Running the Gemini API from Python

May 30, 2024

This article walks through the steps for calling the Vertex AI Gemini API from Python code.

Test environment

Windows 11
Python 3.12.1

Prerequisites

Google Cloud Console setup

In the Google Cloud Console, create a project (or select one you have already created).

You can create and select a project from the following page.

https://console.cloud.google.com/projectselector2/home/dashboard?hl=ja

Enable the Vertex AI API.

https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com&hl=ja

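To use the API, billing must be enabled for the selected project.

Installing the gcloud CLI

To make the API usable, install the gcloud CLI and authenticate your account.

The installation steps are somewhat long, so they are covered in a separate article. Please refer to that article.

Sample code that uses the API

Install the Vertex AI SDK, then authenticate with the gcloud CLI.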


pip install --upgrade google-cloud-aiplatform
gcloud auth application-default login
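
After running the login command, it can be handy to confirm that a credentials file actually exists before calling the API. The following is a minimal sketch using only the standard library. It assumes the usual default locations of the Application Default Credentials file (`%APPDATA%\gcloud` on Windows, `~/.config/gcloud` elsewhere) and the `GOOGLE_APPLICATION_CREDENTIALS` environment variable; the helper names `adc_candidates` and `find_adc` are hypothetical and are not part of any Google library.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def adc_candidates(env: Mapping[str, str], home: Path, is_windows: bool) -> List[Path]:
    """Return the paths where a credentials file is usually expected (assumed defaults)."""
    candidates = []
    # An explicitly configured key file is checked first.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        candidates.append(Path(explicit))
    # Default file written by `gcloud auth application-default login`.
    if is_windows and env.get("APPDATA"):
        base = Path(env["APPDATA"]) / "gcloud"
    else:
        base = home / ".config" / "gcloud"
    candidates.append(base / "application_default_credentials.json")
    return candidates


def find_adc(env: Mapping[str, str], home: Path, is_windows: bool) -> Optional[Path]:
    """Return the first candidate that exists on disk, or None if there is none."""
    for path in adc_candidates(env, home, is_windows):
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_adc(dict(os.environ), Path.home(), os.name == "nt")
    if found:
        print(f"Credentials file found: {found}")
    else:
        print("No credentials file found; run the gcloud login command above.")
```

This only checks that a file is present. It does not validate the credentials themselves or the project's permissions.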

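For the pricing of the Gemini API on Vertex AI, see the official pricing page.

Text prompts

Text prompts are typically used for the following tasks.

Classification: fraud detection, spam filtering, sentiment analysis, etc.
Summarization: summarizing text, generating content such as articles and product descriptions, etc.
Extraction: named entity recognition, relation extraction, question answering, etc.

Sample code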
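
To make the three task types above concrete, here is a small sketch of how a prompt for each might be assembled as a plain string. The wording of the templates and the function names are illustrative assumptions, not something prescribed by Vertex AI; the resulting string is what you would pass to the model as the text prompt.

```python
from typing import List


def classification_prompt(text: str, labels: List[str]) -> str:
    """Ask the model to choose exactly one of the given labels."""
    options = ", ".join(labels)
    return (
        f"Classify the following text as one of: {options}.\n"
        "Answer with the label only.\n\n"
        f"Text: {text}"
    )


def summarization_prompt(text: str, max_sentences: int = 3) -> str:
    """Ask the model for a summary with an upper bound on length."""
    return (
        f"Summarize the following text in at most {max_sentences} sentences.\n\n"
        f"Text: {text}"
    )


def extraction_prompt(text: str, entity_types: List[str]) -> str:
    """Ask the model to list the entities of the requested types."""
    wanted = ", ".join(entity_types)
    return (
        f"Extract every {wanted} mentioned in the following text.\n"
        "Return one item per line as `type: value`.\n\n"
        f"Text: {text}"
    )


if __name__ == "__main__":
    # Example: a spam-filter style classification prompt.
    print(classification_prompt("You have won a free prize! Click here.", ["spam", "not spam"]))
```

Keeping the instruction and the input text in separate, clearly marked parts of the prompt tends to make the output easier to parse, though the best wording depends on the model and the task.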

gemini_text.py


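import vertexai
from vertexai.generative_models import GenerativeModel


def generate_text(prompt: str, model_name: str = "gemini-1.0-pro") -> str:
    """Send a single-turn text prompt to the model and return the response text."""
    model = GenerativeModel(model_name)
    response = model.generate_content([prompt])
    return response.text


def main():
    # Replace "xxx" with your own Google Cloud project ID.
    vertexai.init(project="xxx", location="us-south1")
    # Prompt (Japanese): "Briefly explain the differences between GPT4 and Gemini"
    print(generate_text(prompt="""GPT4とGeminiの違いについて簡潔に説明して"""))


if __name__ == "__main__":
    main()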

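Example run

The response below is reproduced as the model generated it. Generated text can contain factual errors; for instance, this response attributes GPT-4 to Google, although GPT-4 was developed by OpenAI.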


python gemini_text.py
## GPT-4 と Gemini の簡潔な違い

**GPT-4:**

* Google によって開発された大規模言語モデル (LLM)。
* 1750 億のパラメータを持つ。
* 言語タスクにおいて高い性能を発揮する。
* 情報は 2022 年までのものに基づいている。
* 詳細情報は公開されていない。

**Gemini:**

* Google によって開発された大規模言語モデル (LLM)。
* 詳細なパラメータ数は非公開。
* 言語タスクに加え、より幅広いタスクに対応できるよう設計されている。
* 情報は 2023 年 11 月までのものに基づいている。
* 公開情報をもとに訓練されているため、一部の情報は GPT-4 より新しい。

**主な違い:**

* パラメータ数：GPT-4 は 1750 億パラメータ、Gemini は非公開。
* 開発目的：GPT-4 は言語タスクに特化、Gemini はより幅広いタスクに対応。
* 情報の鮮度：GPT-4 は 2022 年、Gemini は 2023 年 11 月までの情報に基づく。
* 情報の公開性：GPT-4 は詳細非公開、Gemini は公開情報に基づく訓練。

## まとめ

GPT-4 と Gemini はどちらも Google によって開発された大規模言語モデルです。GPT-4 は言語タスクに特化しており、より高い性能を発揮しますが、情報が古くなっています。Gemini はより幅広いタスクに対応でき、情報が新しいですが、性能は GPT-4 に劣る可能性があります。

どちらのモデルが優れているかは、目的や用途によって異なります。

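The value passed as location must be one of the available locations; check the list of available locations in the official documentation.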
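
Rather than hard-coding the project ID and location in the script, one option is to read them from environment variables. The sketch below is an assumption-laden example: `GOOGLE_CLOUD_PROJECT` is a commonly used variable name, `VERTEX_LOCATION` is a hypothetical name chosen for this example, and the default of `us-south1` simply mirrors the sample code.

```python
from typing import Mapping, Tuple

# Fallback region; the same value the sample code uses.
DEFAULT_LOCATION = "us-south1"


def resolve_vertex_settings(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (project, location) taken from the given environment mapping."""
    project = env.get("GOOGLE_CLOUD_PROJECT", "").strip()
    if not project:
        raise ValueError("Set GOOGLE_CLOUD_PROJECT to your Google Cloud project ID")
    # VERTEX_LOCATION is a hypothetical variable name used only in this sketch.
    location = env.get("VERTEX_LOCATION", "").strip() or DEFAULT_LOCATION
    return project, location


# Usage inside main() of the sample scripts (requires `import os`):
#   project, location = resolve_vertex_settings(os.environ)
#   vertexai.init(project=project, location=location)
```

This keeps the project ID out of the source file and makes it easy to try a different location without editing the code.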

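Chat prompts

With chat prompts, the model keeps track of the conversation history and uses that history as context for its answers. This is called a multi-turn prompt. (The text prompt described above is also called a single-turn prompt.)

Sample code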
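
The idea of a multi-turn prompt can be illustrated without calling the API at all. The toy class below is not how the Vertex AI SDK is implemented; it is a simplified sketch in which `ToyChat` and `echo_model` are hypothetical names, and the "model" is a stand-in function that only reports how much context it received.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (role, text)


class ToyChat:
    """Keeps the conversation history and passes all of it to the model on every turn."""

    def __init__(self, model: Callable[[List[Turn]], str]) -> None:
        self._model = model
        self.history: List[Turn] = []

    def send_message(self, text: str) -> str:
        self.history.append(("user", text))
        # The model receives the whole conversation so far, not just the new message.
        reply = self._model(list(self.history))
        self.history.append(("model", reply))
        return reply


def echo_model(history: List[Turn]) -> str:
    """Stand-in model: reports how many turns of context it was given."""
    last_user_text = history[-1][1]
    return f"[{len(history)} turns of context] {last_user_text}"


if __name__ == "__main__":
    chat = ToyChat(echo_model)
    print(chat.send_message("Explain X briefly"))
    print(chat.send_message("Translate that answer"))
```

On the second call the stand-in model receives three turns (the first question, the first reply, and the new question), which is why a follow-up such as "translate that answer" can be resolved against the earlier reply.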

gemini_chat.py


import vertexai
from vertexai.generative_models import GenerativeModel


def main():
    # Replace "xxx" with your own Google Cloud project ID.
    vertexai.init(project="xxx", location="us-south1")
    model_name = "gemini-1.0-pro"
    model = GenerativeModel(model_name)
    # The chat session keeps the conversation history between calls.
    chat = model.start_chat()
    queries = [
        # "Briefly explain the differences between GPT4 and Gemini"
        """GPT4とGeminiの違いについて簡潔に説明して""",
        # "Translate that answer into English"
        """その回答を英語に翻訳して""",
    ]
    for query in queries:
        print(query)
        print(chat.send_message([query]).text)


if __name__ == "__main__":
    main()

Example run


python gemini_chat.py
GPT4とGeminiの違いについて簡潔に説明して
##  GPT-4とGeminiの違い

GPT-4とGeminiはどちらも、Googleによって開発された大規模言語モデルです。しかし、それぞれの特徴にはいくつかの違いがあります。

**主な違い**

* **データ:** GPT-4は2022年9月までに公開された1.56兆語のテキストデータで訓練されています。 一方、Geminiはより新しいデータに基づいて訓練されており、正確なデータ量は公開されていません。
* **モデルサイズ:** GPT-4は1750億のパラメータを持っています。 Geminiはより小型のモデルを採用しており、3450億のパラメータを持つモデルと、250億のパラメータを持つモデルの2種類が存在します。
* **機能:** どちらも、文章の生成、翻訳、要約など、さまざまなタスクを実行できます。 しかし、Geminiはより新しいモデルであるため、最新の情報や技術を取り入れたタスクに適しているかもしれません。
* **ユーザーインターフェース:** GPT-4はAPIを通じてアクセス可能ですが、Geminiは対話型のボットとして実装されており、より自然なやり取りが可能です。

**その他の違い**

* **安全性:** GPT-4は、特に有害性のあるテキスト生成の可能性を指摘されています。 Geminiは安全性をより重視した設計になっている可能性があります。
* **開発状況:** GPT-4はすでに広く公開されていますが、Geminiはまだ開発中のモデルです。

**結論:**

GPT-4とGeminiはどちらも優れていますが、それぞれの用途や目的に応じて使い分けることが重要です。

## 補足情報

どちらも大規模言語モデルであるため、常に進化しています。  最新の情報は公式情報を確認することをおすすめします。
その回答を英語に翻訳して
## GPT-4 and Gemini's Differences: A Concise Explanation

GPT-4 and Gemini are both large language models developed by Google. However, they have several key differences:

**Main Differences**:

* **Data**: GPT-4 is trained on 1.56 trillion words of text data published by September 2022. Gemini is trained on more recent data, although the exact amount is not publicly disclosed.
* **Model Size**: GPT-4 has 175 billion parameters. Gemini offers two models: one with 345 billion parameters and another with 250 billion parameters.
* **Functionality**: Both can perform various tasks like text generation, translation, and summarization. However, Gemini, being a newer model, might be more suitable for tasks using the latest information and technologies.   
* **User Interface**: GPT-4 is accessible through an API, while Gemini is implemented as an interactive chatbot, enabling more natural interactions.

**Other Differences**:

* **Safety**: GPT-4 has raised concerns about potentially harmful text generation. Gemini might be designed with a greater emphasis on safety.
* **Development Status**: GPT-4 is already widely available, while Gemini is still under development.

**Conclusion**:

While both GPT-4 and Gemini excel, choosing the right one depends on your specific needs and goals.

## Additional Information

Both models are constantly evolving, so it's best to check official sources for the latest information.
