Skip the Cloud, Keep the Power: Real-Time AI with Conduit and Llama

By Sarah Sicard

4 Jun 2025

The latest Conduit release, v0.13.5, introduces a new built-in processor: the Ollama processor. This processor makes it possible to enrich data in Conduit pipelines by sending a prompt to a specified large language model (LLM). Because the prompts go to a self-hosted model through Ollama, users can perform data transformation directly in their pipelines.

In this post, we will explore what Ollama is and work through some examples of how the Ollama processor can be used for data processing within Conduit.

What is Ollama?

Ollama is a self-hosted tool that acts as a bridge between a machine and an LLM. Self-hosting offers advantages such as:

  • Data Privacy - All data remains on the user’s machine rather than being routed through the LLM’s parent company.
  • Availability - There is no dependence on the availability of the LLM’s parent company.

Ollama empowers developers with greater flexibility and autonomy, eliminating the need to depend on third-party services.
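If you have not used Ollama before, getting a model running locally only takes a couple of commands. As a minimal sketch, assuming a default installation (where the Ollama API listens on http://127.0.0.1:11434):

ollama pull llama3.2   # download the model used in the examples below
ollama serve           # start the local API server (skip if it already runs as a background service)

Once the server is up, any model you have pulled can be referenced by name from a Conduit pipeline.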

Examples

Let’s walk through a few examples of what the Ollama processor can do within Conduit pipelines.

Extrapolate missing data

Imagine that I am trying to move information about various books from my old database into a new database with a different table format. My original table follows a schema like the following:

CREATE TABLE authors (
	id SERIAL PRIMARY KEY,
	name VARCHAR(255),
	age INTEGER,
	books VARCHAR(255)
);
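For reference, here are a few of the rows in my source table (they match the results shown later, and the values are purely illustrative):

INSERT INTO authors (name, age, books) VALUES
	('Charles Dickens', 58, 'Oliver Twist'),
	('Jane Austin', 41, 'Pride and Prejudice'),
	('Arthur Conan Doyle', 76, 'The Adventures of Sherlock Holmes');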

However, the new table that I am transferring my data to follows the schema below:

CREATE TABLE author_info (
	id SERIAL PRIMARY KEY,
	first_name VARCHAR(255),
	last_name VARCHAR(255),
	year_of_birth INTEGER,
	books VARCHAR(255)
);

In order to populate the new database, I will need to ask the LLM to split the given name into a first and last name and to look up each author’s year of birth.

I created a sample Conduit pipeline using the following configuration.

version: 2.2

pipelines:
- id: ollama-author-pipeline
  status: running
  connectors:
    - id: source-connector-1
      type: source
      plugin: builtin:postgres
      settings:
        tables: "authors"
        url: "postgres://username:password@localhost:5433/client1?sslmode=disable"
    - id: dest-connector-1
      type: destination
      plugin: builtin:postgres
      settings:
        url: "postgres://username:password@localhost:5433/client2?sslmode=disable"
        table: "author_info"
  processors:
    - id: ollama-processor-1
      plugin: builtin:ollama
      settings:
        url: "http://127.0.0.1:11434"
        model: "llama3.2"
        prompt: >
          Take the given input, and put the information into a json of the following format:
            {
              "first_name": "something",
              "last_name": "something",
              "year_of_birth": "something",
              "books": "something"
            }
          The incoming name is a famous author, so if the author has three names, please determine whether the middle name should be part of their first name or their last name. You can assume the name field in the input is formatted "firstname lastname".
          The year_of_birth field should be determined based off of the year the author was born. If that cannot be found, determine it based on the current year minus the incoming age of the author.
          The books field will only contain one book, do not return a list.
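With Ollama serving the model, the pipeline can be started. As a sketch, assuming the configuration above is saved under ./pipelines and you are on Conduit v0.13 (the flag name may differ in other versions):

conduit run --pipelines.path ./pipelines

By default Conduit also picks up pipeline configuration files from its ./pipelines directory, in which case no flag is needed.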

After running Conduit with this pipeline, all of the information from my original database is transferred, with the following results.

Source table (authors):

 id |          name          | age |               books
----+------------------------+-----+-----------------------------------
  1 | Charles Dickens        |  58 | Oliver Twist
  2 | Jane Austin            |  41 | Pride and Prejudice
  3 | Charlotte Bronte       |  38 | Jane Eyre
  4 | Edgar Allan Poe        |  40 | The Raven
  5 | Gabriel Garcia Marquez |  80 | One Hundred Years of Solitude
  6 | Sylvia Plath           |  43 | The Bell Jar
  7 | Arthur Conan Doyle     |  76 | The Adventures of Sherlock Holmes
  8 | Ray Bradberry          |  45 | Fahrenheit 451
  9 | L Frank Baum           |  90 | Wizard of Oz
  10 | Charles Darwin         |  50 | Origin of Species

Destination table (author_info):

 id | first_name |  last_name  | year_of_birth |               books
----+------------+-------------+---------------+-----------------------------------
  1 | Charles    | Dickens     |          1812 | Oliver Twist
  2 | Jane       | Austin      |          1775 | Pride and Prejudice
  3 | Charlotte  | Bronté      |          1816 | Jane Eyre
  4 | Edgar      | Allan Poe   |          1809 | The Raven
  5 | Jane       | Austen      |          1815 | Pride and Prejudice
  6 | Sylvia     | Plath       |          1932 | The Bell Jar
  7 | Sir Arthur | Conan Doyle |          1859 | The Adventures of Sherlock Holmes
  8 | Ray        | Bradbury    |          1920 | Fahrenheit 451
  9 | L          | Frank Baum  |          1856 | Wizard of Oz
 10 | charles    | darwin      |          1809 | Origin of Species

This example shows how the Ollama processor transformed the data according to the requirements specified in our prompt. The processor successfully:

  • Split the full name into first and last name components
  • Determined the year of birth from the author’s known birth year, with the provided age as a fallback
  • Carried the original book information over

We can see that the data has mostly been restructured to match the new schema; however, the model has made a few undesired choices (like adding a Sir to Arthur Conan Doyle, and replacing Gabriel Garcia Marquez’s row with a duplicate of Jane Austen), so I will need to refine my prompt accordingly.

AI Sentiment Analysis

In this example, I run a cooking blog and have just released a new cooking video. I am parsing various comments to find feedback that I can act on for my next video.

I need to ask the model to copy each comment into my new table, extract any actionable feedback, and give a sentiment analysis of the comment.
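The pipeline reads from a blog_comment table and writes to a feedback table. A plausible shape for both, inferred from the query results shown below (the exact column types are an assumption):

CREATE TABLE blog_comment (
	id SERIAL PRIMARY KEY,
	name VARCHAR(255),
	content TEXT,
	creation_date TIMESTAMP
);

CREATE TABLE feedback (
	id SERIAL PRIMARY KEY,
	raw_content TEXT,
	feedback TEXT,
	sentiment VARCHAR(32)
);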

This is my sample Conduit pipeline configuration.

version: 2.2
pipelines:
- id: ollama-feedback
  status: running
  connectors:
    - id: source-connector-1
      type: source
      plugin: builtin:postgres
      settings:
        tables: "blog_comment"
        url: "postgres://username:password@localhost:5433/client1?sslmode=disable"
    - id: dest-connector-1
      type: destination
      plugin: builtin:postgres
      settings:
        url: "postgres://username:password@localhost:5433/client2?sslmode=disable"
        table: "feedback"
  processors:
    - id: ollama-processor-1
      plugin: builtin:ollama
      settings:
        url: "http://127.0.0.1:11434"
        model: "llama3.2"
        prompt: >
          Take the given input, and put the information into the following format:
          {
            "raw_content": "something",
            "feedback": "something",
            "sentiment": "something"
          }
          The raw_content field should be filled with the information from "content" in the form of a string.
          The feedback field should contain a string of any information said in the content field that is
          something that could be done to improve.
          The sentiment field should be a string of either positive, negative, or neutral.

After running the pipeline, the state of my data is shown below.

Source table (blog_comment):

 id |    name    |                              content                              |       creation_date
----+------------+-------------------------------------------------------------------+----------------------------
  1 | username01 | This was so helpful! Thanks!!                                     | 2025-03-28 12:29:08.957372
  2 | username02 | I hated that. Should have been 6 min shorter                      | 2025-03-28 12:29:08.957372
  3 | username03 | Good video. Adding some ginger to the recipe would make it better | 2025-03-28 12:29:08.957372

Destination table (feedback):

 id |                            raw_content                            |             feedback             | sentiment
----+-------------------------------------------------------------------+----------------------------------+-----------
  1 | This was so helpful! Thanks!!                                     |                                  | positive
  2 | I hated that. Should have been 6 min shorter                      | Should have been 6 min shorter   | negative
  3 | Good video. Adding some ginger to the recipe would make it better | Adding some ginger to the recipe | positive

Here we can see the processor successfully:

  • Copied each comment’s content into the raw_content column of the new table
  • Extracted any actionable feedback from the original comment
  • Classified each comment’s sentiment as positive, negative, or neutral
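Because the results land in ordinary columns, acting on them is a plain SQL query away. For example, to list only the comments that contain actionable feedback (using the feedback table as sketched above):

SELECT raw_content, feedback
FROM feedback
WHERE feedback IS NOT NULL
  AND feedback <> '';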

Conclusion

The Ollama processor opens up a variety of new possibilities for data processing. Whether you are cleaning, enriching, or analyzing data, your Conduit pipelines now have a powerful new way to interact with your data.

Please try out the new Ollama processor for your own use cases, and let us know how it goes by joining our Discord server.

Sarah Sicard

Software Engineer