AI-Powered Unstructured Data Processing
Vast amounts of valuable data exist in unstructured formats such as articles, reports, legal documents, and web content. For blockchain applications, this data holds significant potential, but it must first be transformed into a structured format that smart contracts can interpret and act upon. Extracting unstructured data and publishing it on-chain in structured form unlocks new real-world use cases, ranging from regulatory compliance and price feeds to election monitoring and contract automation. Without this transformation, blockchains remain disconnected from critical, real-time insights that exist only in unstructured formats.
To bridge this gap, IntelliX leverages Large Language Models (LLMs) to transform unstructured data into actionable, structured formats. These models, integrated within the Data Processor Node, allow developers to fetch data from various unstructured sources, such as news websites or legal texts, and convert it into structured data ready for blockchain publishing. LLMs handle the complexities of interpreting free-form text, reconciling discrepancies, and ensuring that the final output adheres to a user-defined schema, making it fit for on-chain use.
Below is a sample configuration demonstrating how IntelliX can extract the winner of an election from multiple news sources, process the unstructured data with an LLM, and format the result according to a user-defined JSON schema.
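The field names in this sketch (`sources`, `processor`, `output_schema`, `validation`) are illustrative rather than a definitive IntelliX schema; the essential elements are the unstructured sources, the extraction prompt, the target output schema, and the validation threshold referenced in the next section:

```json
{
  "task": "election-winner",
  "sources": [
    "https://example-news-one.com/elections/results",
    "https://example-news-two.com/politics/live",
    "https://example-news-three.com/election-coverage"
  ],
  "processor": {
    "type": "llm",
    "prompt": "From the article text, extract the name of the winning candidate. Return only the fields required by the output schema.",
    "decoding": "greedy"
  },
  "output_schema": {
    "type": "object",
    "properties": {
      "winner": { "type": "string" },
      "race": { "type": "string" }
    },
    "required": ["winner"]
  },
  "validation": {
    "threshold": 0.8
  }
}
```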
Aggregation Logic, Validation, and Data Trustworthiness
To ensure the trustworthiness of the data, multiple nodes in the IntelliX network process the same source data. LLM responses are typically non-deterministic because sampling introduces randomness into the choice of each next token. This variability poses a challenge for oracle networks, which require consistency across all nodes. To address it, IntelliX uses greedy decoding, always selecting the token with the highest probability during inference, so that every node running the same model on the same input produces the same result.
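In pseudocode terms, greedy decoding replaces sampling with an argmax over the model's next-token distribution. The sketch below is a minimal illustration, not IntelliX's actual inference code; it assumes a generic `model` callable that maps a token sequence to next-token logits:

```python
import numpy as np

def greedy_decode(model, prompt_ids, max_new_tokens=64, eos_id=2):
    """Deterministic decoding: always take the highest-probability token.

    With no sampling step, every node running the same model on the
    same prompt produces an identical output sequence.
    """
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = model(ids)               # next-token logits, shape (vocab_size,)
        next_id = int(np.argmax(logits))  # argmax instead of sampling
        ids.append(next_id)
        if next_id == eos_id:             # stop at end-of-sequence
            break
    return ids
```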
In addition to deterministic decoding, IntelliX applies user-defined validation logic to reconcile results across nodes. This logic allows additional checks on data consistency and accuracy. For example, the validation section might include a threshold rule, where a minimum percentage of nodes must return matching results for the data to be considered valid. In the configuration above, the threshold is set to 0.8, meaning at least 80% of the nodes must produce the same result for it to pass validation.
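As a minimal sketch of how such a threshold rule could be applied (the function name and signature here are hypothetical, not part of the IntelliX API):

```python
from collections import Counter

def passes_threshold(node_results, threshold=0.8):
    """Return the majority result if enough nodes agree, else None.

    node_results: the structured outputs (e.g. canonical JSON strings)
    returned by each Data Processor Node for the same task.
    """
    if not node_results:
        return None
    result, votes = Counter(node_results).most_common(1)[0]
    if votes / len(node_results) >= threshold:
        return result   # quorum reached: safe to publish on-chain
    return None         # agreement below threshold: reject or retry
```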