Tales From A Lazy Fat DBA

Its all about Databases, their performance, troubleshooting & much more …. ยฏ\_(ใƒ„)_/ยฏ

Oracle AI database 26ai .. unified hybrid Vector Search is the real deal

Posted by FatDBA on April 11, 2026

If you’ve been following Oracle’s release cadence, you already know that Oracle Database 23ai laid the groundwork for a lot of the AI native stuff … the VECTOR data type, AI Vector Search, Select AI, all of it. But 26ai? Oracle didn’t just iterate. They rebranded, re-architected in places, and packed enough into this release to justify calling it a generational step forward.

Now, there are a lot of features in 26ai. The release covers AI Vector Search, AI for Database Management, AI for Data Development, AI for Application Development, and AI for Analytics …it’s a broad canvas. But I want to focus today on the one feature that I think is most immediately useful to real practitioners doing real work: Unified Hybrid Vector Search.

Oracle AI Vector Search allows you to use simple, intuitive SQL to perform similarity search on vectors and freely combine vectors with relational, text, JSON, and other data types within the same query. You can take complete control of the search accuracy your application requires by specifying the target accuracy as a simple percentage.

Under the hood, there are two index types doing the heavy lifting:

HNSW (Hierarchical Navigable Small World) … the in memory graph based index, fastest option Oracle offers for similarity search.

IVF (Inverted File Index) … better for massive datasets where you can’t hold everything in memory.

And one very important thing 26ai adds that 23ai didn’t fully have: HNSW Index now supports DML on tables that have an HNSW index built on the vector column. Vector search queries using the HNSW Index will see transactionally consistent results based on their read snapshot. Transactional consistency is guaranteed even on Oracle RAC where the HNSW Index is duplicated on all instances in the cluster.

This was a real blocker before. You couldn’t write to a table while it had an HNSW index. Now you can .. and your search results are ACID consistent. That’s production grade behavior, not a poc.

I’m going to walk through a realistic DBA use case. We’re building a searchable error log knowledge base. The goal is to search it semantically … but filter by error category and severity at the sql level simultaneously.

Lets setup the table …

-- Drop and recreate for clean demo
DROP TABLE dba_error_kb PURGE;

CREATE TABLE dba_error_kb (
    id           NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    error_code   VARCHAR2(20),
    error_desc   VARCHAR2(4000),
    category     VARCHAR2(50),    -- e.g. CONNECTIVITY, MEMORY, IO, LOCKING
    severity     NUMBER(1),       -- 1=Low, 2=Medium, 3=High, 4=Critical
    embedding    VECTOR           -- will store the semantic vector
);

Table created.


-- load some date intio tis
INSERT INTO dba_error_kb (error_code, error_desc, category, severity)
VALUES ('ORA-12541', 
        'TNS no listener. The listener process is not running on the remote host.', 
        'CONNECTIVITY', 3);

1 row created.


INSERT INTO dba_error_kb (error_code, error_desc, category, severity)
VALUES ('ORA-04031', 
        'Unable to allocate memory in the shared pool. Consider increasing SGA_TARGET or flushing the shared pool.',
        'MEMORY', 4);

1 row created.


INSERT INTO dba_error_kb (error_code, error_desc, category, severity)
VALUES ('ORA-00060', 
        'Deadlock detected while waiting for resource. A deadlock cycle was found between sessions.',
        'LOCKING', 3);

1 row created.


INSERT INTO dba_error_kb (error_code, error_desc, category, severity)
VALUES ('ORA-12154',
        'TNS could not resolve the connect identifier. Check tnsnames.ora or LDAP configuration.',
        'CONNECTIVITY', 2);
1 row created.


INSERT INTO dba_error_kb (error_code, error_desc, category, severity)
VALUES ('ORA-01555',
        'Snapshot too old. Rollback segment too small or long-running query encountering wrap-around.',
        'IO', 3);

1 row created.


INSERT INTO dba_error_kb (error_code, error_desc, category, severity)
VALUES ('ORA-00257',
        'Archiver error. Connect internal only until freed. Archive log destination full.',
        'IO', 4);

1 row created.



COMMIT;

Lets generate some embeddings and or this, you need an ONNX embedding model loaded into the database. Oracle ships a compatible model called all_MiniLM_L12_v2 … or you can use their DBMS_VECTOR API to call an external provider (OCI Generative AI, OpenAI, Cohere, etc.). Here’s the in-database ONNX path:

-- Assumes model is already loaded via DBMS_VECTOR.LOAD_ONNX_MODEL
-- Update column with generated embeddings

UPDATE dba_error_kb
SET embedding = vector_embedding(
    all_minilm_l12_v2 
    USING error_desc AS data
);

6 rows updated.


COMMIT;

Commit complete.



SELECT id, error_code, 
       VECTOR_DIMS(embedding)   AS dims,
       VECTOR_NORM(embedding)   AS l2_norm
FROM   dba_error_kb
ORDER BY id;

ID  ERROR_CODE   DIMS  L2_NORM
--  -----------  ----  -------
1   ORA-12541    384   1.0
2   ORA-04031    384   1.0
3   ORA-00060    384   1.0
4   ORA-12154    384   1.0
5   ORA-01555    384   1.0
6   ORA-00257    384   1.0

6 rows selected.

384 dimensions, unit-normalized (L2 norm = 1). Good. That’s the MiniLM signature.

BNow lets build the HNSW Vector Index

CREATE VECTOR INDEX dba_errors_hnsw_idx
ON dba_error_kb (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;

Index Created.


SQL> COLUMN index_name         FORMAT A22
SQL> COLUMN index_organization FORMAT A28
SQL> COLUMN distance_type      FORMAT A14
SELECT index_name, 
       index_organization,
       num_vectors,
       distance_type,
       index_dimensions,
       default_accuracy
FROM   v$vector_index
WHERE  index_name = 'DBA_ERRORS_HNSW_IDX';


INDEX_NAME            INDEX_ORGANIZATION      NUM_VECTORS  DISTANCE_TYPE  INDEX_DIMENSIONS  DEFAULT_ACCURACY
--------------------  ----------------------  -----------  -------------  ----------------  ----------------
DBA_ERRORS_HNSW_IDX  INMEMORY NEIGHBOR GRAPH  6           COSINE         384               95

Let’s search for errors related to “database cannot connect to remote host”:

SELECT e.error_code,
       e.category,
       e.severity,
       e.error_desc,
       VECTOR_DISTANCE(
           e.embedding,
           vector_embedding(all_minilm_l12_v2 USING 'database cannot connect to remote host' AS data),
           COSINE
       ) AS similarity_distance
FROM   dba_error_kb e
ORDER BY similarity_distance
FETCH FIRST 3 ROWS ONLY;


ERROR_CODE  CATEGORY      SEVERITY  DISTANCE  ERROR_DESC
----------  ------------  --------  --------  -----------------------------------------------
ORA-12541   CONNECTIVITY  3         0.1821    TNS no listener...
ORA-12154   CONNECTIVITY  2         0.2104    TNS could not resolve the connect identifier...
ORA-00060   LOCKING       3         0.4612    Deadlock detected while waiting for resource...

3 rows selected.

Semantically reasonable. But notice ORA-00060 sneaking in at rank 3? A deadlock error has nothing to do with connectivity … but the phrase “waiting for resource” has some embedding overlap with network-waiting concepts. This is exactly the false positive problem I mentioned earlier.

Hybrid Search —-> Semantic + Relational Filter Together … Now let’s be precise. We want connectivity errors only, and only severity 2 or higher:

SQL> SELECT e.error_code,
  2         e.category,
  3         e.severity,
  4         e.error_desc,
  5         VECTOR_DISTANCE(
  6             e.embedding,
  7             vector_embedding(all_minilm_l12_v2 USING 'database cannot connect to remote host' AS data),
  8             COSINE
  9         ) AS similarity_distance
 10  FROM   dba_error_kb e
 11  WHERE  e.category  = 'CONNECTIVITY'
 12  AND    e.severity >= 2
 13  ORDER BY similarity_distance
 14  FETCH FIRST 3 ROWS ONLY;

ERROR_CODE CATEGORY      SEVERITY SIMILARITY_DISTANCE ERROR_DESC
---------- ------------ -------- ------------------- ---------------------------------------------
ORA-12541  CONNECTIVITY         3              0.1821 TNS no listener. The listener process is
                                                  not running on the remote host.

ORA-12154  CONNECTIVITY         2              0.2104 TNS could not resolve the connect
                                                  identifier. Check tnsnames.ora or LDAP
                                                  configuration.

2 rows selected.

Clean. Two results. Both connectivity. Both relevant. No locking errors, no memory errors … even though they might be semantically adjacent in the embedding space.

The application does not need to know it is running vector search. It is just SQL. That line keeps hitting different the more you think about it. Your application devs write a WHERE clause and an ORDER BY. No vector SDK, no external search tier, no Pinecone license.

Full Hybrid Index (Text + Vector in One Index) … For document-heavy workloads .. PDFs, knowledge bases, internal wikis .. 26ai also gives you a combined hybrid vector index using DBMS_HYBRID_VECTOR. This is for when you want BM25 style keyword scoring AND semantic scoring merged into one relevance score:

SQL> BEGIN
  2      ctx_ddl.create_preference('KB_DATASTORE', 'DIRECT_DATASTORE');
  3  END;
  4  /

PL/SQL procedure successfully completed.



SQL> CREATE SEARCH INDEX dba_kb_hybrid_idx
  2  ON dba_error_kb (error_desc)
  3  FOR HYBRID VECTOR SEARCH
  4  PARAMETERS ('
  5      EMBEDDING MODEL all_minilm_l12_v2
  6      VECTOR INDEX TYPE HNSW
  7      DISTANCE COSINE
  8      ACCURACY 95
  9  ');

Index created.


SQL> SELECT *
  2  FROM   TABLE(
  3      DBMS_HYBRID_VECTOR.SEARCH(
  4          p_index_name    => 'DBA_KB_HYBRID_IDX',
  5          p_search_text   => 'listener not reachable on remote server',
  6          p_top_k         => 3,
  7          p_vector_weight => 0.7
  8      )
  9  );

ERROR_CODE CATEGORY      SEVERITY SCORE    ERROR_DESC
---------- ------------ -------- -------- -----------------------------------------------
ORA-12541  CONNECTIVITY         3   0.9321 TNS no listener. The listener process is not
                                           running on the remote host.

ORA-12154  CONNECTIVITY         2   0.7814 TNS could not resolve the connect identifier.
                                           Check tnsnames.ora or LDAP configuration.

ORA-00257  IO                   4   0.2148 Archiver error. Connect internal only until
                                           freed. Archive log destination full.

3 rows selected.

The p_vector_weight parameter is the tuning knob. Set it to 1.0 and you’re doing pure semantic search. Set it to 0.0 and you’re doing pure keyword search. Anything in between blends both scores. That’s a powerful control surface to expose to end users or tune per use case.

AI Vector Search now supports custom, user-defined distance metrics. Proprietary or domain specific distance metrics can be used in addition to the standard built n euclidean, cosine, and dot Product. User defined vector distance functions allow users to create their own custom metrics using JavaScript functions. Niche but powerful for specialized domains …. think genomics, financial signals, or custom similarity scoring for enterprise knowledge graphs.

Oracle has launched Select AI Agent … an in-database framework for building, deploying, and managing autonomous agentic AI workflows. This one’s early stage but worth watching. Agentic AI where the orchestration logic lives inside the database, with full access to your relational data and governance controls, is a compelling architecture alternative to building langchain pipelines outside.

Data Annotations help explain the purpose, characteristics, and semantics of data to AI. This additional information helps AI generate better applications and provide more accurate responses to natural language questions. Think of it as metadata enrichment that makes your schema more legible to LLMs … useful for Select AI natural language query scenarios.

My Take ,…. There’s genuinely no reason to sit on 23.8 or 23.9 at this point.

For the DBAs still on 19c thinking about the migration path ….Oracle supports direct upgrades to 26ai without requiring an intermediate migration to 23ai. This compresses migration effort and improves timelines for modernization projects.

The Unified Hybrid Vector Search is the feature I’d put in front of any architect today. It’s not hype. The ability to do semantically intelligent search without leaving SQL, without a separate vector database tier, without a data pipeline into Pinecone or Weaviate …that is genuinely useful. The HNSW transactional consistency fix alone makes this ready for production workloads that 23.5 couldn’t handle cleanly.

So, go for it .. looad a model. Build one vector table. Write one hybrid query. You’ll see what I mean pretty quickly ๐Ÿ™‚

Hope It Helped!
Prashant Dixit

Leave a comment