Building a Financial Report Retrieval System with LlamaIndex and Gemini 2.0

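Financial reports are essential for assessing a company's health. They often run to hundreds of pages, which makes it hard to extract specific insights efficiently. Analysts and investors spend hours combing through balance sheets, income statements, and footnotes just to answer simple questions such as: what was the company's revenue in 2024? With recent advances in LLMs and vector search, we can automate financial report analysis using LlamaIndex and related frameworks. This post shows how to build a financial RAG system with LlamaIndex, ChromaDB, Gemini 2.0, and Ollama that answers queries about lengthy reports accurately.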

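Learning Objectives

  • Understand why efficient analysis calls for a financial report retrieval system.
  • Learn how to preprocess and vectorize financial reports with LlamaIndex.
  • Explore ChromaDB as a vector database for document retrieval.
  • Implement query engines for financial data analysis with Gemini 2.0 and Llama 3.2.
  • Explore advanced query routing with LlamaIndex to get better insights.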

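Why Do We Need a Financial Report Retrieval System?

Financial reports contain key information about a company's performance, including revenue, expenses, liabilities, and profitability. However, these reports are long and full of specialist terminology, so extracting the relevant information by hand is time-consuming for analysts, investors, and executives.

A financial report retrieval system automates this process through natural-language queries. Users can simply ask questions such as "What was the revenue in 2023?" or "Summarize the liquidity concerns for 2023" without searching through PDF files. The system retrieves and summarizes the relevant sections quickly, saving hours of manual work.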

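Project Implementation

To implement the project, we first set up the environment and install the required libraries.

Step 1: Set Up the Environment

First, create a conda environment for development.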

    $ conda create --name finrag python=3.12
    $ conda activate finrag

Step 2: Install the Required Python Libraries

Installing the libraries is a key step in any project:

    $ pip install llama-index llama-index-vector-stores-chroma chromadb
    $ pip install llama-index-llms-gemini llama-index-llms-ollama
    $ pip install llama-index-embeddings-gemini llama-index-embeddings-ollama
    $ pip install python-dotenv nest-asyncio pypdf

Step 3: Create the Project Directory

Now create a project directory and, inside it, a file named .env that holds all API keys so they can be managed securely.

    # In the .env file
    GOOGLE_API_KEY="<your-api-key>"

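We load environment variables from the .env file so that sensitive API keys stay out of the notebook itself. This keeps the Gemini API (Google API) key protected.

We will use a Jupyter Notebook for the project. Create a notebook file and follow the steps below.

Step 4: Load the API Key

Now load the API key: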

    import os
    from dotenv import load_dotenv

    # Read key/value pairs from .env into the process environment.
    load_dotenv()
    GEMINI_API_KEY = os.getenv("GOOGLE_API_KEY")

    # Uncomment only to check that .env is being read correctly.
    # print(f"GEMINI_API_KEY: {GEMINI_API_KEY}")
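If the variable is missing, os.getenv returns None and the failure only shows up later as an authentication error. The sketch below fails early instead. The helper name require_env is hypothetical and used only for illustration; it is not part of python-dotenv or LlamaIndex.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for this walkthrough, not a library function.
    """
    value = os.getenv(name, "").strip()
    # Treat the placeholder from the .env template as "not set" too.
    if not value or value == "<your-api-key>":
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call load_dotenv() first."
        )
    return value


# Usage, after load_dotenv() has run:
# GEMINI_API_KEY = require_env("GOOGLE_API_KEY")
```

Checking the key once at the top of the notebook means a misnamed variable is caught before any indexing work starts.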

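Now that the environment is ready, we can move on to the next and most important stage.

Processing Documents with LlamaIndex

Collect the financial report of Motorsport Games from the AnnualReports website.

Click here to download.

The first page looks like this:

[Figure: first page of the Motorsport Games financial report. Source: Report]

The report runs to 123 pages in total, but we only need the financial statements, so we extract them into a new PDF for this project.

How? It is straightforward with the pypdf library.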

    from pypdf import PdfReader, PdfWriter

    reader = PdfReader("NASDAQ_MSGM_2023.pdf")
    writer = PdfWriter()

    # The financial statements sit at 0-based page indices 66 through 103 (38 pages).
    pages_to_extract = range(66, 104)
    for page_num in pages_to_extract:
        writer.add_page(reader.pages[page_num])

    output_pdf = "Motorsport_Games_Financial_report.pdf"
    with open(output_pdf, "wb") as outfile:
        # Write to the open file handle rather than to the path string.
        writer.write(outfile)
    print(f"New PDF created: {output_pdf}")
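pypdf indexes reader.pages from zero, so range(66, 104) covers the 67th through 104th pages of the file. When you read page numbers off a PDF viewer, it is easy to be off by one. The helper below is a minimal sketch of the conversion; the name pages_to_indices is an assumption made for illustration.

```python
def pages_to_indices(first_page: int, last_page: int) -> range:
    """Convert an inclusive, 1-based page span into 0-based pypdf indices."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return range(first_page - 1, last_page)


# Pages 67 through 104 map to the same indices as range(66, 104).
indices = pages_to_indices(67, 104)
print(len(indices), indices[0], indices[-1])  # prints: 38 66 103
```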

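The new report file has only 38 pages, which lets us embed the document quickly.

Loading and Splitting the Financial Report

Place the newly created Motorsport_Games_Financial_report.pdf in the project's data directory; this is the file that will be indexed for the project.

Financial reports usually come as PDFs containing large amounts of tabular data, footnotes, and legal statements. We use LlamaIndex's SimpleDirectoryReader to load these files and convert them into Document objects.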

    from llama_index.core import SimpleDirectoryReader

    # Load every file in ./data as LlamaIndex Document objects.
    documents = SimpleDirectoryReader("./data").load_data()

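Because the report is too large to process as a single document, we split it into smaller chunks, or nodes. Each chunk corresponds to a page or section, which makes retrieval more effective.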

    from copy import deepcopy
    from llama_index.core.schema import TextNode

    def get_page_nodes(docs, separator="\n---\n"):
        """Split each document into page nodes, by separator."""
        nodes = []
        for doc in docs:
            doc_chunks = doc.text.split(separator)
            for doc_chunk in doc_chunks:
                # Each node gets its own copy of the parent document's metadata.
                node = TextNode(
                    text=doc_chunk,
                    metadata=deepcopy(doc.metadata),
                )
                nodes.append(node)
        return nodes
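To see what the separator-based split does without loading LlamaIndex, here is a small standard-library sketch. PageChunk is a stand-in for TextNode, and split_pages is a hypothetical variant that also drops blank chunks and records each chunk's position; neither name comes from the original code.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class PageChunk:
    """Stand-in for LlamaIndex's TextNode, used only in this sketch."""
    text: str
    metadata: dict = field(default_factory=dict)


def split_pages(text: str, metadata: dict, separator: str = "\n---\n") -> list[PageChunk]:
    """Split text on the separator, dropping chunks that are only whitespace."""
    chunks = []
    for position, piece in enumerate(text.split(separator)):
        if not piece.strip():
            continue  # skip empty pages so they never reach the index
        meta = deepcopy(metadata)  # every chunk gets an independent metadata copy
        meta["chunk_position"] = position
        chunks.append(PageChunk(text=piece.strip(), metadata=meta))
    return chunks


sample = "Balance sheet\n---\n\n---\nIncome statement"
for chunk in split_pages(sample, {"file_name": "report.pdf"}):
    print(chunk.metadata["chunk_position"], chunk.text)
```

The deepcopy matters: without it, all chunks from one document would share a single metadata dictionary, and a change to one would show up in all of them.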

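The figure below illustrates the document ingestion process.

[Figure: document ingestion process]

Our financial data is now ready to be vectorized and stored for retrieval.

Building the Vector Database with ChromaDB

We use ChromaDB as a fast local vector database. The embedded representations of our financial text are stored in ChromaDB.

We initialize the vector database and configure the nomic-embed-text model through Ollama to generate embeddings locally. This assumes Ollama is running on your machine and the model has already been pulled, for example with `ollama pull nomic-embed-text`.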

    import chromadb
    from llama_index.llms.gemini import Gemini
    from llama_index.embeddings.ollama import OllamaEmbedding
    from llama_index.vector_stores.chroma import ChromaVectorStore
    from llama_index.core import Settings

    # Local embedding model served by Ollama.
    embed_model = OllamaEmbedding(model_name="nomic-embed-text")

    # Persist the collection on disk so the embeddings survive notebook restarts.
    chroma_client = chromadb.PersistentClient(path="./chroma_db")
    chroma_collection = chroma_client.get_or_create_collection("financial_collection")
    vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

Finally, we build a vector index with LlamaIndex's VectorStoreIndex. The index connects our vector database to LlamaIndex's query engine.

    from llama_index.core import VectorStoreIndex, StorageContext

    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    vector_index = VectorStoreIndex.from_documents(
        documents=documents,
        storage_context=storage_context,
        embed_model=embed_model,
    )

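The code above builds a vector index from the financial documents using nomic-embed-text. This takes some time, depending on the specifications of your local machine.

Once the index has been built, you can reuse the stored embeddings with the code below whenever needed, without rebuilding the index.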

    vector_index = VectorStoreIndex.from_vector_store(
        vector_store=vector_store, embed_model=embed_model
    )

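This lets you use the embeddings already persisted in ChromaDB.

With the heavy lifting done, it is time to query the report and relax.

Querying Financial Data with Gemini 2.0

Once the financial data is indexed, we can ask natural-language questions and get accurate answers. We use the Gemini 2.0 Flash model for querying; it works with our vector database to fetch the relevant sections and generate insightful responses.

Setting Up Gemini 2.0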

    from llama_index.llms.gemini import Gemini

    llm = Gemini(api_key=GEMINI_API_KEY, model_name="models/gemini-2.0-flash")

Creating the Query Engine from the Vector Index with Gemini 2.0

    # Retrieve the 5 most similar chunks for each query and pass them to Gemini.
    query_engine = vector_index.as_query_engine(llm=llm, similarity_top_k=5)
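The similarity_top_k parameter controls how many of the closest chunks are handed to the LLM as context. The sketch below shows the idea with plain Python and toy three-dimensional vectors. It uses cosine similarity for illustration only; the metric actually applied depends on how the ChromaDB collection is configured. The function names and toy vectors are assumptions made for this example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Return the k chunk ids most similar to the query, best first."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real embedding vectors are much longer.
chunks = {
    "revenue_note": [0.9, 0.1, 0.0],
    "lease_note": [0.0, 0.2, 0.9],
    "income_statement": [0.8, 0.3, 0.1],
}
print([cid for cid, _ in top_k([1.0, 0.2, 0.0], chunks, k=2)])
# prints: ['revenue_note', 'income_statement']
```

A larger k gives the model more context for questions that span several notes, at the cost of a longer prompt and more noise.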

Example Queries and Responses

Below are several queries and their responses.

Query 1

    response = query_engine.query("What is the revenue for the year ended December 31, 2022?")
    print(str(response))

Response

[Figure: response to Query 1]

Corresponding image from the report:

[Figure: source passage in the report for Query 1]

Query 2

    response = query_engine.query(
        "What is the Net Loss Attributable to Motorsport Games Inc. for the year ended December 31, 2022?"
    )
    print(str(response))

Response

[Figure: response to Query 2]

Corresponding image from the report:

[Figure: source passage in the report for Query 2]

Query 3

    response = query_engine.query(
        "What are the Liquidity and Going concern for the Company on December 31, 2023"
    )
    print(str(response))

Response

[Figure: response to Query 3]

Query 4

    response = query_engine.query(
        "Summarise the Principal versus agent considerations of the company?"
    )
    print(str(response))

Response

[Figure: response to Query 4]

Corresponding image from the report:

[Figure: source passage in the report for Query 4]

Query 5

    response = query_engine.query(
        "Summarise the Net Loss Per Common Share of the company with financial data?"
    )
    print(str(response))

Response

[Figure: response to Query 5]

Corresponding image from the report:

[Figure: source passage in the report for Query 5]

Query 6

    response = query_engine.query(
        "Summarise Property and equipment consist of the following balances as of December 31, 2023 and 2022 of the company with financial data?"
    )
    print(str(response))

Response

[Figure: response to Query 6]

Corresponding image from the report:

[Figure: source passage in the report for Query 6]

Query 7

    response = query_engine.query(
        "Summarise The Intangible Assets on December 31, 2023 of the company with financial data?"
    )
    print(str(response))

Response

[Figure: response to Query 7]

Query 8

    response = query_engine.query(
        "What are leases of the company with yearwise financial data?"
    )
    print(str(response))

Response

[Figure: response to Query 8]

Corresponding image from the report:

[Figure: source passage in the report for Query 8]

Querying Locally with Llama 3.2

We can also query the financial report locally with Llama 3.2, without relying on a cloud-based model.

Setting Up Llama 3.2:1b

    from llama_index.llms.ollama import Ollama

    # A generous timeout, since a small local model can still be slow on CPU.
    local_llm = Ollama(model="llama3.2:1b", request_timeout=1000.0)
    local_query_engine = vector_index.as_query_engine(llm=local_llm, similarity_top_k=3)

Query 9

    response = local_query_engine.query(
        "Summary of chart of Accrued expenses and other liabilities using the financial data of the company"
    )
    print(str(response))

Response

[Figure: response to Query 9]

Corresponding image from the report:

[Figure: source passage in the report for Query 9]

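Advanced Query Routing with LlamaIndex

Sometimes we need both detailed retrieval and summarized insights. We can get both by combining a vector index with a summary index.

  • The vector index handles precise document retrieval.
  • The summary index handles concise financial summaries.

We have already built the vector index. Now we create a summary index, which summarizes the financial statements hierarchically.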

    from llama_index.core import SummaryIndex

    # Build the page-level nodes with the get_page_nodes helper defined earlier.
    page_nodes = get_page_nodes(documents)
    summary_index = SummaryIndex(nodes=page_nodes)

Next, bring in RouterQueryEngine, which decides based on the query type whether to answer from the summary index or the vector index.

    from llama_index.core.tools import QueryEngineTool
    from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
    from llama_index.core.selectors import LLMSingleSelector

Now create the summary query engine:

    import nest_asyncio

    # use_async=True needs a re-entrant event loop inside Jupyter.
    nest_asyncio.apply()
    summary_query_engine = summary_index.as_query_engine(
        llm=llm, response_mode="tree_summarize", use_async=True
    )
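The tree_summarize response mode builds its answer bottom-up: groups of chunks are summarized, then those summaries are summarized again, until a single answer remains. The sketch below shows that reduction with a stub in place of the LLM call. The fixed fan_in grouping is an assumption of this sketch, since LlamaIndex groups text by what fits in the prompt rather than by a fixed count, and the function names are hypothetical.

```python
from typing import Callable


def tree_summarize(chunks: list[str], summarize: Callable[[list[str]], str], fan_in: int = 2) -> str:
    """Reduce chunks level by level until one summary remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2, or the tree never shrinks")
    if not chunks:
        return ""
    level = list(chunks)
    while len(level) > 1:
        # Summarize each group of up to fan_in items to form the next level.
        level = [
            summarize(level[start:start + fan_in])
            for start in range(0, len(level), fan_in)
        ]
    return level[0]


def stub_summarize(group: list[str]) -> str:
    """Placeholder for an LLM call: it only joins its inputs in brackets."""
    return "[" + " + ".join(group) + "]"


print(tree_summarize(["p1", "p2", "p3", "p4", "p5"], stub_summarize))
# prints: [[[p1 + p2] + [p3 + p4]] + [[p5]]]
```

The summaries within one level are independent of each other, which is why running them concurrently with use_async=True can shorten the wait on a long report.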

The summary query engine is wrapped in a summary tool, and the vector query engine in a vector tool.

    # Summary tool: answers high-level summarization questions.
    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_query_engine,
        description=(
            "Useful for summarization questions related to Motorsport Games Company."
        ),
    )

    # Vector tool: retrieves specific passages and figures.
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=query_engine,
        description=(
            "Useful for retrieving specific context from the Motorsport Games Company."
        ),
    )

With both tools in place, we connect them through a router. When a query arrives, the router analyzes it and decides which tool to use.

    # Router query engine: the selector asks the LLM to pick one tool per query.
    adv_query_engine = RouterQueryEngine(
        llm=llm,
        selector=LLMSingleSelector.from_defaults(llm=llm),
        query_engine_tools=[summary_tool, vector_tool],
        verbose=True,
    )
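The routing decision rests entirely on the tool descriptions, so it helps to see the mechanism in isolation. The sketch below is a toy stand-in: it picks the tool whose description shares the most words with the query, whereas LLMSingleSelector asks the LLM to make that choice. The Tool class, keyword_selector, and the descriptions are all hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """Minimal stand-in for a query engine tool."""
    name: str
    description: str
    run: Callable[[str], str]


def keyword_selector(query: str, tools: list[Tool]) -> Tool:
    """Pick the tool whose description shares the most words with the query."""
    query_words = set(query.lower().split())

    def overlap(tool: Tool) -> int:
        return len(query_words & set(tool.description.lower().split()))

    # On a tie, max() keeps the first tool in the list.
    return max(tools, key=overlap)


tools = [
    Tool("summary", "summarize summary overview of the report", lambda q: "summary engine"),
    Tool("vector", "what is specific value figure lookup", lambda q: "vector engine"),
]


def route(query: str) -> str:
    """Send the query to whichever tool the selector picks."""
    return keyword_selector(query, tools).run(query)


print(route("Summarize the revenue of the company"))  # prints: summary engine
print(route("What is the total assets figure"))       # prints: vector engine
```

The same lesson carries over to the LLM-based selector: vague or overlapping descriptions make the choice unreliable, so each description should state clearly what kind of question the tool answers.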

The advanced query system is now fully set up, so let's query the new router-based engine.

Query 10

    response = adv_query_engine.query(
        "Summarize the charts describing the revenue of the company."
    )
    print(str(response))

Response

[Figure: response to Query 10]

As you can see, the router chose the summary tool because the query asks for a summary.

Query 11

    response = adv_query_engine.query("What is the Total Assets of the company Yearwise?")
    print(str(response))

Response

[Figure: response to Query 11]

Here the router chose the vector tool, because the query asks for specific information rather than a summary.

All the code used in this article is available here.

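Summary

With LlamaIndex, ChromaDB, and modern LLMs, we can analyze financial reports efficiently. The system supports automated financial insights, on-demand querying, and robust summarization. Systems like this make financial analysis more accessible and efficient, which supports better decisions in investing, trading, and running a business.

  • LLM-powered document retrieval can greatly reduce the time spent analyzing complex financial reports.
  • A hybrid approach that combines cloud and local LLMs balances cost, privacy, and flexibility in the system design.
  • LlamaIndex's modular framework makes it easy to automate financial report workflows.
  • The same approach can be adapted to other domains such as legal documents, medical reports, and regulatory filings, which makes it a general-purpose RAG solution.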

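Frequently Asked Questions

Q1. How does the system handle different financial reports?

A. The system is designed to handle any structured financial document: it breaks the document into text chunks, embeds them, and stores them in ChromaDB. New reports can be added dynamically without rebuilding the existing index.

Q2. Can it be extended to generate financial charts and visualizations?

A. Yes. By integrating Matplotlib, Pandas, and Streamlit, you can visualize trends such as revenue growth, net loss analysis, or asset distribution.

Q3. How does the query routing system improve accuracy?

A. RouterQueryEngine automatically detects whether a query needs a summarized response or the retrieval of specific financial data. This reduces irrelevant output and keeps responses accurate.

Q4. Is the system suitable for real-time financial analysis?

A. Yes, but it depends on how often the vector store is updated. You can use the OpenAI embeddings API with a continuous ingestion pipeline to query up-to-date financial reports dynamically.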
