
Elffar Analytics Blog

by Joel Acha

Finally, Gantt Charts Arrive in Oracle Analytics Cloud!

28/10/2025

For as long as I can remember, Oracle Analytics users have been asking for one simple but powerful capability: a native Gantt chart.
Something to track timelines, visualise dependencies, and monitor progress - all within the same dashboards where KPIs and trends already live.

With the November 2025 update, Oracle Analytics Cloud (OAC) finally delivers.
It’s a long-requested feature that transforms how we visualise projects, portfolios, and operational workflows.
The Wait Is Over

Until now, anyone wanting to visualise schedules inside OAC had to get creative — using bar charts to simulate timelines, or embedding third-party components. It worked, but it was clunky.

The new Gantt Chart visualisation makes this native and intuitive.
It allows analysts and project teams to show project timelines, milestones, and progress bars directly inside their OAC workbooks — fully integrated with data security, filters, and visual interactions.

This isn’t just a pretty new chart. It’s a meaningful step toward operational analytics, where OAC becomes a live window into how work is progressing, not just how metrics are trending.

What’s New in the November 2025 Update
The Gantt Chart visual introduces a new way to represent time-based activities.
Here’s what it currently supports:

• Task timelines: Start and end dates rendered as horizontal bars.
• Progress tracking: Percentage complete shown visually within each bar.
• Milestones: Zero-duration tasks represented as markers.
• Grouping: Organise tasks by project, phase, or resource type.
• Baselines: Display baseline start and end dates alongside actuals for schedule comparison.
• Dependencies: Align related tasks sequentially using shared attributes.
• Tooltips: Show contextual details such as owner, status, priority, or duration.

For many teams - PMOs, delivery leads, and operations managers - this fills a long-standing visualisation gap in OAC, turning it into a capable project-tracking tool. It bridges analytics and project management, enabling users to track, analyse, and present progress in one platform.
Try It Yourself – Sample Data
To test the new Gantt, I’ve created a realistic dataset that you can import directly into OAC.
Download: oac_gantt_test_data.csv (CSV, 5 KB)

It contains three concurrent projects:

  • Website Revamp – A full lifecycle example with UX, build, and content streams.
  • ERP Rollout – Multiple workstreams, dependencies, and a go/no-go milestone.
  • Mobile App Launch – Parallel iOS and Android sprints with shared backend integration.

It also includes shared dependencies (e.g. a global change freeze) to demonstrate how Gantt timelines can overlap across projects.

Each row in the dataset represents a task, with columns for start/end dates, duration, status, percentage complete, baseline start/end, and dependencies - all mapped for easy use in the Gantt visual.
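
To give a feel for the shape such a dataset takes, here is a small sketch that builds a similar file with Python's standard library. The column names are my own guesses based on the description above, not necessarily the exact headers in the downloadable file:

```python
import csv
import io

# Hypothetical column layout -- the actual headers in
# oac_gantt_test_data.csv may differ.
tasks = [
    {"project": "Website Revamp", "task": "UX Design",
     "start_date": "2025-11-03", "end_date": "2025-11-14",
     "duration_days": 10, "status": "In Progress", "pct_complete": 60,
     "baseline_start": "2025-11-03", "baseline_end": "2025-11-12",
     "depends_on": ""},
    {"project": "ERP Rollout", "task": "Go/No-Go Decision",
     "start_date": "2025-12-15", "end_date": "2025-12-15",
     "duration_days": 0, "status": "Planned", "pct_complete": 0,
     "baseline_start": "2025-12-15", "baseline_end": "2025-12-15",
     "depends_on": ""},   # zero duration -> rendered as a milestone marker
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(tasks[0]))
writer.writeheader()
writer.writerows(tasks)
print(buf.getvalue())
```

Note how the milestone row simply reuses the same start and end date; that zero-duration convention is what the Gantt visual picks up as a marker.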

How to Build the Gantt in Oracle Analytics Cloud

  1. Upload the dataset (oac_gantt_test_data.csv) to your OAC instance.
  2. Create a new workbook and choose the Gantt Chart visualisation.
  3. Map the fields as below.

Once configured, you’ll see your projects laid out across a timeline - with bars showing duration, coloured progress, and milestone markers for key events.
The resulting Gantt chart gives you a timeline view of your project tasks. Each horizontal bar represents a task’s duration from start date to end date, with markers indicating baselines, milestones and percent complete. This makes it easy to see overlaps, dependencies and progress at a glance.
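
The arithmetic behind those bars is simple: each one is offset by the days since the earliest start, and its width is the task's inclusive duration. A toy text renderer (task names and dates invented for illustration, obviously nothing like OAC's rendering) makes that concrete:

```python
from datetime import date

# Invented example tasks (name, start, end) -- purely illustrative.
tasks = [
    ("UX Design", date(2025, 11, 3), date(2025, 11, 14)),
    ("Build",     date(2025, 11, 10), date(2025, 12, 5)),
    ("Content",   date(2025, 11, 24), date(2025, 12, 12)),
]

def render(tasks):
    """One text row per task: indent = days after the earliest start,
    bar width = inclusive duration in days."""
    origin = min(start for _, start, _ in tasks)
    rows = []
    for name, start, end in tasks:
        offset = (start - origin).days
        length = (end - start).days + 1
        rows.append(f"{name:<10}{' ' * offset}{'#' * length}")
    return rows

for row in render(tasks):
    print(row)
```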

Why This Matters

This update pushes OAC further into operational reporting territory.
Instead of switching to tools like Smartsheet, Excel, or Project for schedule reviews, you can keep everything inside OAC — governed, secured, and shared via the same semantic model.

For organisations already integrating delivery data (e.g. from Oracle Fusion PPM, Jira, or Primavera) into OAC, this unlocks a new layer of insight:

  • Track schedule health directly in dashboards.
  • Correlate project progress with KPIs and costs.
  • Identify slippage vs. baseline in real time.
  • Present timelines cleanly to executives without exporting to PowerPoint.

Community Demand

This feature has been one of the most upvoted requests on the Oracle Analytics Idea Lab.
Many people in the community have asked for a proper Gantt visual for years, especially those working in delivery or programme management roles.

It’s great to see Oracle Product Management not only listen, but execute — and deliver a native visual that feels integrated, performant, and flexible.

Final Thoughts

The Gantt Chart visualisation is a small feature with a big impact.
It closes a long-standing gap in Oracle Analytics Cloud and moves the platform closer to a true operational analytics experience.

Whether you’re tracking project sprints, release schedules, or transformation roadmaps, you can now visualise timelines, progress, and dependencies — all in one place, without leaving OAC.

Oracle Analytics Cloud AI Agent: Where Conversations Meet Context

14/10/2025

At a recent Oracle Analytics Partner Meeting, one demo stood out to me (the others were great as well!) - the new AI Agent for Oracle Analytics Cloud (OAC). I’ve since spoken further with the product manager and been granted early access ahead of its LA (limited availability) in the November 2025 release, and I can already see the foundations of something significant taking shape.
At first glance, the OAC AI Agent looks and feels similar to the Fusion AI Agent Studio - and that’s no coincidence. Oracle appears to be unifying its Redwood AI agent look and feel across platforms, enabling analytics, applications, and custom experiences to share a unified user experience. In OAC, this translates into an embedded conversational interface that sits directly within your analytics workspace. Ask a question, and the agent doesn’t just return a text summary - it understands your semantic model, data lineage, and context before generating a response.
From Chatty to Knowledgeable: The Librarian Analogy

To understand what makes this so important, it helps to think of the AI Agent as a librarian.

A large language model (LLM) on its own is like a well-spoken librarian with an excellent memory but no access to your organisation’s archive. Ask them a question, and they’ll respond confidently and eloquently, but they’re drawing only on general world knowledge and patterns they’ve learned before. The result often sounds convincing, yet it may lack the precision or evidence that a business decision demands.

The OAC AI Agent, on the other hand, gives that librarian the keys to your private archive. When you ask a question, they don’t just rely on memory and their extensive real-world knowledge; they walk into your own library of governed data, reports, and documents, retrieve the most relevant material, and then craft a response grounded in fact.

That’s the power of Retrieval-Augmented Generation (RAG) - it lets Oracle’s AI Agent combine the fluency of language models with the factual grounding of your enterprise knowledge.

How the OAC AI Agent Works
Creating an AI Agent in Oracle Analytics Cloud

To begin creating an AI Agent, navigate to the menu and select the Create AI Agent option.
This initiates the process and brings you directly to the AI Agent configuration. Immediately upon entering the configuration screen, you are prompted to add a dataset that will serve as the foundation for the AI Agent. It is essential to ensure that this dataset has already been indexed and that appropriate synonyms for attributes have been configured.
 
These preparatory steps are crucial for enabling the AI Agent to effectively leverage the dataset and provide meaningful, context-aware responses.
[Image: Dataset configuration]
[Image: Select dataset to add to AI Agent]

You are then taken to the configuration screen.

[Image: OAC AI Agent configuration]
Configuring and Supplementing the OAC AI Agent

Step 1: Entering Supplemental Instructions
Begin by providing supplemental information that gives the agent additional context about its specific use case. Additional prompt instructions help the agent better interpret user questions within a functional domain, ensuring the AI Agent is tailored to the unique requirements and environment it will operate within.

Step 2: Defining the First Message
The First Message serves as an introductory text displayed to users interacting with the agent. It describes the agent’s purpose and sets expectations for what the agent is designed to achieve.

Step 3: Saving the Agent
After all relevant information has been entered, proceed to save the agent. This action records the configuration and prepares the agent for further enhancement.

Step 4: Supplementing with Documents
Once the agent has been saved, you can enhance its capabilities by supplementing the previously entered contextual information with additional documents. Uploading these documents grounds the agent in your organisation’s custom enterprise knowledge, allowing it to provide more accurate and relevant responses.

OAC AI Agent: Technical Foundations

At its core, the OAC AI Agent leverages the vector search capabilities of the Oracle infrastructure that forms the backbone of OAC. This vector search enables the agent’s Retrieval-Augmented Generation (RAG) functionality, allowing it to efficiently surface relevant information in response to user queries. The OAC AI Agent achieves this by integrating three essential components, each playing a critical role in transforming natural-language questions into trustworthy, contextual insights.

1. Intent Recognition (LLM Layer)
The large language model (LLM) layer is responsible for interpreting what the user is seeking. It analyses the natural-language query to determine the user’s intent and aligns this intent with relevant datasets, key performance indicators (KPIs), or dashboards available within OAC.

2. Retrieval Layer (RAG Engine)
Once the user’s intent has been established, the agent’s retrieval layer searches for pertinent content across a range of defined, governed sources. This process begins with OAC’s own semantic model and expands to include external knowledge repositories, such as custom knowledge files uploaded to the system and supplemental information defined in the AI Agent.

3. Response Rendering (OAC Context)
After retrieving the necessary data and knowledge, the information passes through Oracle’s Analytics Visualisation framework. The agent then generates a natural-language response that is firmly rooted in verified data, ensuring that every response respects OAC’s metadata, data lineage, and security protocols.
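
The retrieve-then-generate loop behind those three layers can be sketched in miniature. This toy example is my own simplification, not Oracle's implementation: it scores a small "archive" of snippets against a question using bag-of-words cosine similarity, then assembles a grounded prompt for whatever LLM sits in the generation layer:

```python
import math
import re
from collections import Counter

def vectorise(text):
    """Bag-of-words term counts -- a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# A tiny "archive": in OAC this role is played by the indexed dataset,
# supplemental instructions, and uploaded knowledge documents.
archive = [
    "Q3 customer satisfaction fell 8 points after the support migration.",
    "The ERP rollout go-live is scheduled for December.",
    "Call-centre wait times doubled during the Q3 support migration.",
]

question = "Why did customer satisfaction decline in Q3?"
qvec = vectorise(question)

# Retrieval layer: rank governed snippets by similarity, keep the top two.
ranked = sorted(archive, key=lambda doc: cosine(qvec, vectorise(doc)), reverse=True)
context = ranked[:2]

# Generation layer: the LLM answers grounded in the retrieved facts.
prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQ: {question}"
print(prompt)
```

The real agent adds what this sketch cannot: semantic-model awareness, lineage, and security filtering before anything reaches the model.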
[Image: Results from an OAC AI Agent]
Key Features and Considerations
Dataset Preparation and Management
  • Indexing and Synonym Setup: Ensure that the dataset is properly indexed and that relevant synonyms have been configured. This facilitates more effective and accurate retrieval of information based on user queries.
  • Column Exclusion: Exclude any columns that are not required for analysis or retrieval. This helps streamline data processing and maintains the relevance of the dataset.
  • Supported Document Formats: Custom Knowledge documents can be uploaded in both .PDF and .TXT formats, allowing for flexibility in the types of information included.
  • Single Dataset Limitation: At present, only one dataset can be used at a time, ensuring focus and coherence in data retrieval and analysis.
  • Dataset Filtering: Filters can be applied to the dataset, enabling users to narrow down the scope of information based on specific requirements or criteria.
Context and Accessibility
  • Contextual Insights: Context is derived from both supplemental information and any uploaded documents, ensuring responses are grounded in the most relevant and up-to-date knowledge.
  • Accessibility: The feature is accessible either from the AI Agent menu option or directly via the AI Assistant in the Insights panel, providing users with flexible entry points for their queries.
 
How the OAC AI Agent Delivers Value

The OAC AI Agent produces responses that are designed to be highly effective for business users. This is achieved through a combination of generative AI capabilities, robust grounding in enterprise knowledge, and adherence to organisational standards.
  • Conversational — The responses are naturally conversational, leveraging the power of generative AI to deliver fluid and engaging dialogue that feels intuitive and approachable.
  • Grounded — Each response is firmly anchored in knowledge derived from enterprise data and documents, ensuring accuracy and relevance by referencing up-to-date information from within the organisation.
  • Governed — All output remains consistent with your organisation’s security protocols and definitions, providing confidence that information is managed and shared in line with established governance frameworks.

This unique blend of conversational fluency and factual accuracy distinguishes the OAC AI Agent from standalone chat-based AI tools, delivering responses that are both engaging and trustworthy for enterprise use.
Early Days, Big Potential

Let’s be clear — this feature is in its infancy. The current build focuses on natural-language exploration incorporating Retrieval-Augmented Generation (RAG) and narrative generation, with a roadmap that will expand its reasoning and automation capabilities over time.

What’s exciting isn’t just the interface, but the architecture that’s emerging beneath it. For the first time, Oracle Analytics is embracing Retrieval-Augmented Generation (RAG). That means the AI Agent won’t rely solely on a large language model to generate responses. Instead, it will retrieve and ground its output in enterprise data and knowledge — both structured and unstructured.

In practical terms, this opens the door for analysts and business users to ask questions that blend internal data with documents, policies, reports, and contextual information stored across the organisation. Whether it’s sales performance data, a product specification PDF, or a customer-service transcript, the AI Agent will eventually be able to bring these sources together to deliver context-aware insights.

Bringing Unstructured Knowledge into the Analytics Conversation

Historically, analytics platforms have struggled to bridge the gap between structured data (tables, metrics, and KPIs) and unstructured information (documents, notes, images, or messages). With RAG, Oracle is moving to close that gap.

This isn’t just about generating summaries — it’s about creating a richer, more informed analytical experience. Imagine asking:

“What were the main factors behind last quarter’s decline in customer satisfaction?”

Today, OAC might point you to a metric or dashboard. With RAG, the AI Agent could augment that response with context drawn from call-centre transcripts, customer feedback reports, or support documentation — all retrieved securely from enterprise knowledge stores.

The result is a shift from data-driven insights to knowledge-driven understanding.

Governed Intelligence, Oracle Style

One of the key advantages here is governance. Unlike standalone chatbots, the OAC AI Agent inherits the same security, metadata, and lineage controls that underpin Oracle Analytics. Responses remain explainable, consistent, and aligned with the organisation’s governed data model — ensuring that insights stay reliable even as AI becomes more conversational.

This approach also complements Oracle’s broader AI ecosystem. The same underlying framework powers Fusion Applications and APEX AI Agents. As these services evolve, we can expect deeper integration, shared prompt orchestration, and unified management of knowledge sources across the Oracle Cloud stack.

Looking Ahead

The OAC AI Agent represents a starting point, not a destination. It’s a glimpse into where analytics is heading — from dashboards and KPIs towards context-aware conversations grounded in enterprise knowledge.

As I explore this feature further through early access, I’ll be focusing on:
  • How RAG is implemented and where knowledge sources can be defined.
  • The interplay between OAC’s semantic model and retrieval layers.
  • How well the agent can integrate unstructured enterprise knowledge into analytical reasoning.

For now, it’s early days — but the direction is clear. With the AI Agent, Oracle Analytics isn’t just adding generative AI to dashboards; it’s laying the foundation for a new class of governed, knowledge-aware analytics experiences.

Stay tuned — I’ll share a deeper hands-on review once the November 2025 update goes live.

Introducing Oracle AI Data Platform: A Unified Foundation for Enterprise AI

3/10/2025

Artificial intelligence is no longer a side project. For enterprises, AI has become a strategic priority—transforming how organisations innovate, compete, and operate. Yet most businesses still struggle with fragmented data pipelines, disconnected tools, and governance challenges that slow down progress; the underlying root cause is how disparate data remains scattered across the enterprise.

While 78% of organisations planned to use AI in 2024 (Global AI Adoption Statistics: A Review from 2017 to 2025), the reality is that 68% of these organisations have data silos as their top concern (Data Strategy Trends in 2025: From Silos to Unified Enterprise Value - DATAVERSITY), and siloed data can cost companies up to 30% of their annual revenue (What are Data Silos and What Problems Do They Cause?|Definition from TechTarget). The culprit? The average enterprise runs on nearly 900 applications, with only one-third integrated (What Are Data Silos & Why is it a Problem? | Salesforce US), creating the very fragmentation that prevents AI success.

Think of enterprise data like a busy international airport. Passengers arrive from different places, each with different documentation requirements:
  • Structured data is like UK passengers, travelling with standard passports and predictable checks; these passengers can use automated technology to speed through the airport.
  • Semi-structured data is like EU passengers, still fairly standardised but with slight differences in documentation and rules.
  • Unstructured data is like international passengers from all over the world, carrying varied paperwork that requires more manual checks and scrutiny.

Without a well-designed terminal, air traffic control, and secure customs processes, it would be chaos.
The new Oracle AI Data Platform (AIDP) is that airport terminal for AI—a single hub where all types of data arrive, are organised, governed, and routed to their various destinations so analytics tools and AI applications can “take flight” safely and efficiently.

Oracle announced the AI Data Platform at Oracle AI World in Las Vegas on 14 October 2025, and it’s now generally available. Customers can access the live product site and documentation today, meaning you can onboard, configure the Master Catalog, and start building governed lakehouse-plus-AI pipelines on OCI straight away. 
Why Oracle AI Data Platform Matters

At its core, AIDP helps enterprises do three things better:
  • Unify enterprise data for AI: Bring together all your data into a connected platform, removing silos and creating AI-ready pipelines.
  • Accelerate AI development: Use integrated tools, notebooks, and GenAI agent frameworks to move faster without the overhead of stitching together separate environments.
  • Innovate at scale: Orchestrate AI workloads across Oracle and third-party environments, backed by OCI’s optimised infrastructure for cost-effective performance.
The result? Faster time to value, improved governance, and the ability to scale AI beyond pilots into real enterprise impact.

A Hypothetical Use Case: From Data Warehouse to AI-Powered Insights

Consider a typical scenario:
  • An enterprise has built a robust data warehouse in Autonomous Database (ADW) to consolidate structured data.
  • Oracle Analytics Cloud (OAC) provides dashboards and visualisations, helping business teams track KPIs and trends.
  • However, AI isn’t being used, and unstructured data—documents, images, logs, call transcripts—sits outside the analytical process.

Here’s how AIDP helps transform this setup:
  1. Bring unstructured data into play: AIDP can ingest and catalogue documents, PDFs, and multimedia alongside structured ADW data, enriching the analytical picture.
  2. Enable AI-driven insights: Data scientists and analysts can use AIDP’s Spark notebooks to apply machine learning models directly on both structured and unstructured datasets.
  3. Governance and trust: With Role-Based Access Control (RBAC), metadata cataloguing, and lineage, all new AI-ready datasets are managed as securely and reliably as the ADW warehouse.
  4. Seamless analytics in OAC: OAC continues as the visualisation layer, now enriched with AI-derived features and predictive insights.
In short, AIDP helps organisations move beyond descriptive dashboards to predictive and prescriptive intelligence, while leveraging the investments already made in ADW and OAC.

How Oracle AI Data Platform Supports the Full Data Workflow

One of AIDP’s key strengths is that it covers the entire lifecycle of enterprise data, much like how an airport manages passengers from arrival to departure.
  1. Ingestion from multiple sources (Arrivals Hall)
    • Data enters from many places: SaaS apps, IoT devices, on-prem systems, and third-party feeds.
    • Like flights arriving from different countries, each data source brings its own rules and timing:
      • Structured data (UK passengers) with standard passports and predictable checks.
      • Semi-structured data (EU passengers) with slightly different but still fairly standardised documents.
      • Unstructured data (Other international passengers) carrying diverse documents that require more careful checks.
  2. Data storage (Baggage Claim & Holding Areas)
    • Object Storage manages unstructured data (the oversized luggage, odd-shaped items that don’t fit neatly).
    • Autonomous Database (ADW) holds structured data (the regular suitcases, perfectly tagged and easy to track).
    • Open table formats like Delta Lake, Iceberg, and Hudi ensure every type of “baggage” is stored consistently, like a baggage system designed to handle every airline’s rules.
  3. Transformation and enrichment (Customs & Security)
    • Just as passengers go through passport control and security checks, data must be cleaned, validated, and enriched.
    • Spark-powered compute and workflow orchestration make this process smooth, while ensuring compliance and efficiency.
  4. Governance and security (Immigration & Border Control)
    • The Master Catalog is the record of who entered, when, and what they carried.
    • RBAC and lineage enforce strict policies—only the right people can access the right data, just as border officers verify visas and permissions.
  5. AI and advanced analytics (Departure Gates)
    • Once cleared, passengers board their flights to final destinations.
    • In AIDP, this is where data powers machine learning, GenAI agents, and predictive analytics—transforming raw arrivals into actionable journeys.
  6. Consumption and collaboration (Connections & Departures)
    • Finally, passengers (data) connect to their flights—whether that’s Oracle Analytics Cloud dashboards, third-party BI tools, or Delta Sharing with partners.
    • Smooth transfers ensure data doesn’t get delayed, lost, or misdirected.

By covering every stage of the workflow, AIDP ensures that UK (structured), EU (semi-structured), and international (unstructured) passengers all move smoothly through the airport, reaching their destinations as trusted, AI-driven insights.

What is the Medallion Architecture?

The Medallion Architecture is a layered data design pattern used to organise data in a data lake or lakehouse for clarity, quality, and reusability. It’s structured into three main layers: Bronze, where raw data is ingested “as is” from source systems; Silver, where data is cleaned, validated, and enriched for consistency and reliability; and Gold, where curated, business-ready data is optimised for analytics, reporting, and machine learning. This layered approach improves data quality at each stage while maintaining traceability from raw to refined insights.
In AIDP, this spans Object Storage, open table formats (Delta/Iceberg/Hudi), and Autonomous Data Warehouse (ADW), all governed by the Master Catalog and RBAC.

Bronze — Land (raw, “as is”)
Purpose: Capture the truth of what arrived, without fixing it yet.

  • Structured (databases/SaaS): Land extracts or CDC snapshots into Object Storage and/or register ADW tables via an External Catalogue for zero-copy access. Keep source fidelity (datatypes, nulls, odd codes).
  • Semi-structured (JSON/events): Land JSON, Avro, CSV, Parquet as-is in Object Storage; record schema hints only.
  • Unstructured (files/media): Land documents, images, audio, logs in Object Storage as Volumes.
  • Operations/Governance: Minimal transforms. Stamp ingest metadata (source, load time, checksum); start lineage; coarse RBAC.
Airport analogy: arrivals hall — busy, mixed, unfiltered.

Silver — Refine (cleaned, standardised, enriched)
Purpose: Make data structurally sound, consistent and joinable.

  • Structured track:
    • Standardise datatypes, units, currencies and codes; de-duplicate; enforce keys and constraints.
    • Build conformed dimensions, SCD staging, and validated facts.
    • Write out as Delta/Iceberg/Hudi or materialise into ADW staging if warehousing downstream.
  • Semi-structured track:
    • Parse/flatten JSON, infer/lock schemas, normalise arrays/maps to relational sets.
  • Unstructured track:
    • Use Spark + OCR/NLP/speech to extract entities/tables/text.
    • Normalise into rows/columns; de-dup; add confidence scores.
  • Convergence:
    • Join structured with extracted signals (e.g. customer_id, invoice_no, email/phone hash for entity resolution).
    • Apply quality tests (row counts, referential integrity, domain checks).
    • Everything catalogued with lineage back to Bronze (files/tables).

Airport analogy: organised lounge — fewer people, rules applied, order emerging.

Gold — Serve (curated, business-ready)
Purpose: Publish trusted datasets for BI, ML and sharing.

  • Warehouse-centric pattern: Load Gold into ADW for fast SQL, governance, and your existing semantic layer; OAC/Power BI/Tableau connect via SQL/JDBC. Ideal when most reporting already lives in ADW.
  • Lakehouse-centric pattern: Keep Gold as Delta/Iceberg/Hudi on Object Storage; expose via JDBC/Delta Sharing; OAC blends lakehouse Gold with ADW facts if needed. Ideal when you want minimal data movement and time-travel/ACID on the lake.
  • Outputs: Conformed facts/dims, KPI marts, and feature tables for ML/GenAI.

Airport analogy: premium lounge — calm, curated, ready to board.

AIDP makes implementing this pattern simpler, with built-in orchestration and governance.
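
To make the Bronze → Silver → Gold progression concrete, here is a deliberately small, framework-free sketch. In AIDP you would express the same steps as Spark jobs over Object Storage and ADW; the field names and rules here are invented for illustration:

```python
# Bronze: raw records landed "as is" -- mixed casing, duplicates, bad rows.
bronze = [
    {"customer_id": "C1", "amount": "100.0", "ccy": "gbp"},
    {"customer_id": "C1", "amount": "100.0", "ccy": "gbp"},   # duplicate load
    {"customer_id": "C2", "amount": "250.5", "ccy": "GBP"},
    {"customer_id": None, "amount": "bad",   "ccy": "GBP"},   # fails validation
]

# Silver: standardise datatypes and codes, de-duplicate, enforce keys.
seen, silver = set(), []
for row in bronze:
    try:
        clean = {"customer_id": row["customer_id"],
                 "amount": float(row["amount"]),
                 "ccy": row["ccy"].upper()}
    except (TypeError, ValueError):
        continue                      # quality test: reject unparseable rows
    if clean["customer_id"] is None:
        continue                      # enforce key constraint
    key = tuple(clean.values())
    if key not in seen:               # de-duplicate on the full record
        seen.add(key)
        silver.append(clean)

# Gold: curated, business-ready aggregate for BI / ML consumption.
gold = {}
for row in silver:
    gold[row["customer_id"]] = gold.get(row["customer_id"], 0.0) + row["amount"]

print(gold)   # {'C1': 100.0, 'C2': 250.5}
```

The point is the one-way flow: each layer only reads from the one before it, so every Gold number can be traced back through Silver rules to a Bronze record.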
What Are Delta Lake, Iceberg, and Hudi?

If you’re new to these technologies, here’s a quick explainer:
  • Delta Lake: Adds reliability to data lakes with ACID transactions, schema evolution, and time travel.
  • Apache Iceberg: Optimised for very large analytic tables, with scalable metadata management.
  • Apache Hudi: Focuses on streaming ingestion and incremental processing.
AIDP supports all three through Delta UniForm, giving enterprises flexibility without lock-in.

Built on Open Source, Delivered as Managed

Enterprises want the flexibility of open source, without the overhead of managing it at scale. AIDP blends the best of both:
  • Apache Spark for scalable compute
  • Delta Lake, Iceberg & Hudi support via Delta UniForm
  • JDBC for BI connectivity to OAC, Tableau, Power BI
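
To picture what the ACID and "time travel" guarantees of these table formats mean in practice, here is a language-agnostic toy in plain Python — not the real Delta or Iceberg API, just the core idea that each commit produces an immutable snapshot and older versions stay queryable:

```python
# Toy versioned table, loosely mimicking Delta/Iceberg-style time travel.
class VersionedTable:
    def __init__(self):
        self._versions = [[]]            # version 0 is the empty table

    def commit(self, rows):
        # Copy-on-write: a new version never mutates older snapshots.
        self._versions.append(list(self._versions[-1]) + list(rows))

    def read(self, version=None):
        # Default read sees the latest version; older ones stay queryable.
        return self._versions[-1 if version is None else version]

t = VersionedTable()
t.commit([{"order": 1, "amount": 90}])
t.commit([{"order": 2, "amount": 40}])

print(len(t.read()))        # latest version holds 2 rows
print(len(t.read(1)))       # "time travel" back to version 1: 1 row
```

Real formats add transaction logs, schema evolution, and concurrent-writer safety on top, but the snapshot-per-commit model is the mental picture to keep.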

The Bigger Picture

With AIDP, Oracle isn’t just building another data platform — it’s constructing the air traffic control tower of enterprise AI. Think of your data as flights arriving from every corner of the globe: structured data landing from domestic routes, semi-structured touching down from across Europe, and unstructured streaming in from long-haul international journeys. AIDP coordinates the safe arrival, organisation, and departure of all of them, ensuring each passenger is where they need to be. By reducing unnecessary transfers, keeping to open flight paths, and providing a single terminal for AI development, Oracle makes sure your entire data estate operates like a well-run airport — efficient, secure, and ready to deliver value.
Ready to transform your data chaos into AI-powered insights? Explore Oracle AI Data Platform and see how it can serve as your enterprise's AI airport terminal.

Fusion Data Intelligence Under the Hood: A Deep Dive into the Architecture

20/7/2025

In the previous post, we traced how Fusion Data Intelligence (FDI) evolved from OBIA. In this second instalment of our FDI‑introductory series, you’ll explore the underlying technology and architecture that power FDI’s cloud-native analytics platform.
2. The FDI Architecture Ecosystem (The “Big Picture”)

At its core, Fusion Data Intelligence (FDI) is a fully managed, cloud-native analytics platform running on Oracle Cloud Infrastructure (OCI). It stitches together your Fusion Cloud Applications, Oracle-managed data pipelines, Autonomous Data Warehouse (ADW), and Oracle Analytics Cloud (OAC) into a seamless, scalable end-to-end analytics solution - one that Oracle deploys, operates, and continuously evolves for you (there is some configuration that administrators need to carry out).

First, Fusion Cloud SaaS applications - including ERP, HCM, SCM and CX pillars - serve as the transactional data sources. Oracle provides prebuilt ingestion pipelines tailored to each functional pillar, handling everything from data extraction and change data capture (CDC) to transformation and consistent mapping into analytics-ready format.

These pipelines write data directly into an OCI-hosted Autonomous Data Warehouse, which transforms and loads the Fusion data into a unified star-schema data model covering multiple functional domains. The schema is:
  • Immutable and prebuilt, reducing modelling effort,
  • Extensible, allowing for additional external or custom Fusion data,
  • Optimised for high-performance querying across SaaS application pillars via conformed dimensions like customer, product, ledger and fiscal calendar.
This architecture supports refreshes that can be scheduled incrementally or on demand, with zero downtime - ensuring analytics remains uninterrupted during data pipeline updates. Custom flexfield extensions from Fusion Applications can be included in the pipelines, bringing bespoke business data into the analytics layer.

Once data arrives in the Autonomous Data Warehouse (ADW), Oracle Analytics Cloud takes over for semantic modelling and visualisation. A prebuilt semantic layer wraps the raw star schema into business-friendly subject-area views - covering finance, human resources, supply chain and customer experience - complete with standardised key metrics and dashboards.

Through OAC, FDI delivers not just dashboards but intelligent, action-driven analytics, featuring natural-language querying, ML-based forecasting and anomaly detection to name just a few.

🔗 Summary Flow

  1. Fusion SaaS Apps (ERP/HCM/SCM/CX) →
  2. Oracle-managed data ingestion pipelines (CDC, ETL, flexfield support) →
  3. Star-schema data model in Autonomous Data Warehouse (extensible star schema, conformed dimensions) →
  4. Oracle Analytics Cloud (semantic layer, dashboards, AI-powered insights and intelligent apps).

This end-to-end ecosystem is fully managed by Oracle - covering provisioning, upgrades, performance tuning, and integration with Fusion App releases - offering a friction-free, scalable approach to enterprise analytics.
3. Data Movement & Integration

FDI’s data movement layer is built around Oracle-managed, prebuilt pipelines that automate ELT and Change Data Capture (CDC) for Fusion Applications (ERP, HCM, SCM, CX). These pipelines are configured and controlled through the intuitive FDI Console, making it easy for administrators to activate, modify or schedule updates with minimal effort. You don’t need to build complex ETL processes - Oracle handles the heavy lifting, while you focus on business relevance and reporting needs.

By default, data pipelines are incremental with zero downtime, keeping analytics up-to-date without interrupting service. You also have the flexibility to perform on-demand full reloads, useful for data corrections or model updates - all managed with just a few clicks in the Console.

Crucially, the architecture supports extensibility in two key ways:
​
  1. Fusion data flexibility – Custom flexfields defined in Fusion Apps are automatically picked up and mapped into the ADW schema without additional development.
  2. External data ingestion – You can supplement your analytics with data from outside Oracle: use the FDI Console’s data augmentation connectors (e.g. Salesforce, EBS, PeopleSoft, Shopify), self-service file uploads (Excel), or ETL tools such as Oracle Data Integrator, Oracle Integration Cloud or your choice of third-party loaders to load into a custom schema in the same ADW instance. For bulk external loads, FDI supports bulk data augmentation jobs that can be scheduled and monitored via the Console UI.

All pipelines and augmentations are managed through the FDI Console. As an administrator, you can configure initial parameters - such as extract start dates, currency preferences, and schedule frequency - directly in the console interface. Any subsequent edits to pipelines, functional areas, or augmentations are seamless, with Oracle handling deployment and execution behind the scenes.
✅ Summary: Core Benefits of FDI Pipelines

  • Oracle-managed pipelines: low maintenance, no custom ETL.
  • Console-based configuration: easy scheduling without code.
  • Zero-downtime incremental loads: fresh data, uninterrupted analytics.
  • Flexfield support: easy to add custom business fields.
  • External data extensibility: blend Fusion and non-Fusion data in the same ADW.
4. Lakehouse & Warehousing Foundation

At the heart of Fusion Data Intelligence lies a star-schema model deployed on Oracle’s Autonomous Data Warehouse (ADW) - a cloud-native, self-tuning database that underpins fast, enterprise-grade reporting and analytics. Here’s how it’s structured and why it matters:

⚙️ Prebuilt Star Schema in ADW
When FDI is provisioned, Oracle automatically creates a prebuilt star schema in ADW. This schema includes fact tables and a network of conformed dimensions - shared across multiple functional areas - that serve as the glue for cross-pillar analytics.

Common dimensions include:

  • Customer / Common Party – used across AR, AP, SCM and CX for a 360° view of customer interactions 
  • Supplier – central to procurement and payables
  • Product – essential for SCM, finance, and sales analysis
  • Fiscal Calendar – shared across financial, HCM, project and supply chain modules
  • Business Unit / Ledger – enabling segmentation and consolidated reporting

These shared dimensions enable users to analyse, for example, how procurement spend (SCM) impacts cash flow (finance), or how HR-driven workforce changes correlate with sales performance - a cross-functional insight made possible by a common semantic backbone.
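To make the idea concrete, here is a small, purely hypothetical sketch - the names and numbers are invented, and FDI performs this kind of join inside ADW and OAC rather than in application code. Two fact sets from different pillars join through the same customer dimension, so revenue and pipeline can be compared side by side:

```python
# Hypothetical illustration of a conformed "Customer" dimension:
# fact rows from two pillars share the same customer key, so metrics
# from both can be combined through the shared dimension.

customer_dim = {
    "C1": {"name": "Acme Ltd", "region": "EMEA"},
    "C2": {"name": "Globex", "region": "APAC"},
}

ar_invoices = [  # finance pillar fact rows
    {"customer": "C1", "amount": 1200.0},
    {"customer": "C2", "amount": 800.0},
    {"customer": "C1", "amount": 300.0},
]

cx_opportunities = [  # CX pillar fact rows
    {"customer": "C1", "pipeline": 5000.0},
    {"customer": "C2", "pipeline": 2500.0},
]

def revenue_vs_pipeline(dim, invoices, opportunities):
    """Join both fact sets through the shared customer dimension."""
    result = {}
    for key, attrs in dim.items():
        revenue = sum(r["amount"] for r in invoices if r["customer"] == key)
        pipeline = sum(r["pipeline"] for r in opportunities if r["customer"] == key)
        result[attrs["name"]] = {"revenue": revenue, "pipeline": pipeline}
    return result

print(revenue_vs_pipeline(customer_dim, ar_invoices, cx_opportunities))
```

Because both fact sets reference the same dimension key, no bespoke mapping layer is needed - which is exactly the benefit conformed dimensions bring to cross-pillar reporting.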

🏗️ Support for External Data & Custom Schemas

​FDI doesn’t just ingest Fusion source data - it enables easy integration of external datasets into the same ADW environment. Whether it’s non-Oracle systems, legacy data, purchased data feeds, or even weather information, FDI supports loading external tables into custom schemas that can extend the star schema and semantic model.

This extensibility is key to bridging out-of-the-box analytics with bespoke business insights - enhancing customer segmentation, supplying additional cost drivers to per-product profitability, or blending external KPIs directly alongside Fusion metrics.

🔍 Benefits of the Lakehouse Foundation

  1. Cross-pillar consistency: Reporting across HR, finance, supply chain and CX benefits from shared dimensions and semantic logic.
  2. Query performance: ADW’s columnar, parallel engine delivers high performance on aggregated workloads across large datasets.
  3. Scalability & elasticity: Cloud-native scaling ensures performance keeps pace with data growth—without manual tuning.
  4. Governance in one place: Shared data model and schema simplify security, data lineage and audit tracking.
✅ Summary
Under the hood, FDI’s star-schema in ADW provides a robust, extensible greenfield analytics foundation. Built on conformed dimensions and a scalable data warehouse, it enables seamless mash-ups of Fusion data with external sources, supporting rich, multi-domain analytics that truly span the enterprise.
5. Semantic Layer & Pre‑Built Metrics

FDI abstracts hundreds of physical tables into logical business subject areas - finance (GL profitability, AP ageing, AR revenue, Trial Balance), HCM (talent acquisition, workforce core), procurement (spend, POs), and CX (campaign ROI, opportunity pipeline) - all underpinned by conformed dimensions. It includes a KPI library with over 2,000 standard metrics, accessible via Oracle Analytics Cloud’s intuitive key-metric editor and drag-and-drop visualisations. In essence, this semantic layer creates a unified business vocabulary that simplifies reporting and ensures consistency across the enterprise.

🔐 Complementing Fusion-Defined Security

FDI leverages Fusion’s built-in role-based security model, so the semantic layer inherits data roles, duty roles, and row/object-level filters defined in Fusion Cloud Applications. Access control is enforced through the Oracle Identity and Access Management (IAM) Service and the FDI Console, ensuring that users only see data they’re authorised to view. This unified approach simplifies administration and compliance by avoiding double entry of security definitions.

🧩 Hiding Complexity Through Logical Abstraction

Rather than exposing raw tables, FDI offers a logical semantic layer that shields users from underlying complexity. Here’s what it achieves:

  • Simplified reporting — Subject-area views like “GL Profitability” or “Spend by Supplier” present meaningful business concepts, not table joins.
  • Consistent vocabulary — Terms like “Customer”, “Supplier” or “Fiscal Calendar” remain consistent across pillars and dashboards.
  • Reusability — Prebuilt key metrics can be used across workbooks and dashboards without redefinition.
  • Governed extensibility — Analysts can add custom dimensions or metrics without touching base tables, via semantic extensions that maintain upgrade compatibility.

✅ Summary: User Experience & Governance Wins

  • Business-friendly UX: users interact with semantic views, not complex table structures.
  • Aligned metrics: the same definition of revenue, spend and headcount across all reports.
  • Security by design: Fusion’s security model applies seamlessly across the semantic layer.
  • Safe extensions: semantic extensions allow customisation without jeopardising prebuilt content.

6. Visualisation and Intelligent Dashboards

At the heart of Fusion Data Intelligence’s presentation layer is Oracle Analytics Cloud (OAC) - a robust analytics platform that delivers pre‑built dashboards, workbooks, visualisations, and natural‑language queries right out of the box. These are tightly embedded into the FDI ecosystem, enabling immediate use of richly designed visuals without the need for manual development.

📦 Out‑of‑the‑Box Dashboards & Catalog Extensibility

FDI comes with a comprehensive set of functional dashboards tailored to key business areas - such as GL profitability, workforce analytics, procurement spend, and campaign ROI. These are readily available via the OAC catalog, simplifying discovery and speeding up time to insight. Moreover, the catalog supports extensibility - you can add new dashboards, visualisations or analytics modules to the catalog, making them available organisation-wide once published.

🛠 Self-Service Data Exploration & Ad Hoc Analytics

Beyond consuming pre-built content, OAC empowers users to bring in their own datasets - whether spreadsheets, CSVs or other database connections - for ad hoc querying and analysis. Business users can combine external datasets with FDI’s semantic model, build bespoke workbooks, and share insights - all without requiring IT support. This self-service capability eliminates bottlenecks and fosters data-driven creativity.

🤖 Embedded AI & ML Capabilities

OAC also delivers built-in AI and Machine Learning features, including:

  • AI Assistant, which opens up access to insights through natural-language querying backed by a Large Language Model.
  • Auto-insights, which automatically surface notable trends and anomalies.
  • Natural-language querying, allowing users to ask questions conversationally.
  • Access to OCI AI services and Oracle Machine Learning (OML) for in-platform predictive analytics and custom AI model execution.

This extends FDI beyond static reporting to an intelligent, self-optimising analytics system.

✅ Why This Matters 

  • Immediate value: pre-built dashboards reduce time to insight.
  • Discoverable and extendable: the OAC catalog supports publishing new content across teams.
  • Empowered users: self-service analytics removes IT dependency and encourages exploration.
  • AI-powered insights: auto-insights and natural-language queries bring intelligence to every user.
  • Actionable intelligence: embedded intelligent apps provide signals with recommended next steps.
7. Governance, Security & Lineage

Fusion Data Intelligence isn’t just about delivering insights - it’s built on a robust foundation of security governance and data lineage that brings trust, safety, and compliance to the analytics lifecycle.

🔐 Security Inherited from Fusion & Managed via OCI IAM

FDI inherits its security framework directly from Fusion Cloud Applications. Role-based access, including data roles and duty roles configured in Fusion, are seamlessly enforced within the FDI semantic layer and Autonomous Data Warehouse (ADW). This ensures that users can access only the data they are authorised to see - without duplicating access definitions in multiple systems.

User and group management within FDI is handled through OCI’s Identity and Access Management Service (IAM). You can sync your Fusion App users and roles into OCI IAM or manage them natively via OCI, and then assign access through system and job-specific groups tailored to FDI. This 1:1 mapping ensures governance is inherited and consistent across both transactional and analytics layers.

Oracle also manages infrastructure-level security - covering upgrades, patching, encryption, IAM policy enforcement, key management, and auditing - helping to maintain compliance and relieve the operational burden on your team. 

🧭 Data Lineage & Quality Built-In

​Trusted analytics demand transparency - and FDI delivers that through built-in data lineage and validation mechanisms. The system tracks the flow of data from source tables in Fusion Apps, through ingestion pipelines, into curated star schemas, and finally into Semantic Layer metrics and dashboards.

Fusion SCM Analytics documentation provides end‑to‑end lineage spreadsheets that detail column‑ and table-level mappings, making it easy to trace every KPI back to its source fields. You can also monitor pipeline activity in the FDI Console, which records execution timestamps, row counts, and error logs - providing a clear audit trail of data loads and transformations.

Further, FDI includes validation metrics that reconcile data loaded into ADW against transactional data in Fusion. These can be scheduled or run on-demand, with reports surfaced directly in OAC - making it easy to identify data drift or discrepancies and swiftly pinpoint areas for correction.
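Conceptually, such a validation boils down to comparing row counts (or sums) between source and warehouse. A minimal, invented sketch - not FDI's actual implementation, which is configured in the Console, and with made-up table names and counts:

```python
# Illustrative load-reconciliation check: compare row counts reported
# by the source system against row counts loaded into the warehouse,
# and flag any table where they differ.

def reconcile(source_counts, warehouse_counts):
    """Return tables whose source and warehouse row counts differ."""
    discrepancies = {}
    for table, src in source_counts.items():
        wh = warehouse_counts.get(table, 0)
        if src != wh:
            discrepancies[table] = {"source": src, "warehouse": wh}
    return discrepancies

source = {"AP_INVOICES": 10_500, "GL_JOURNALS": 42_000}
warehouse = {"AP_INVOICES": 10_500, "GL_JOURNALS": 41_950}

print(reconcile(source, warehouse))
# flags GL_JOURNALS, which is 50 rows short in the warehouse
```

In FDI the equivalent checks are surfaced as reports in OAC, so analysts see the discrepancy rather than running code.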
​
✅ Summary: Trust, Safety, and Compliance

  • Security: inherited Fusion roles and OCI IAM management - no duplication, consistent access enforcement.
  • Governance: encryption, patching, and audit via the Oracle-managed OCI stack - reduced admin overhead and compliance risk.
  • Lineage: column-level mappings and pipeline logs - traceability from source to dashboard.
  • Data quality: built-in validation jobs and reconciliation workbooks - confidence in data accuracy and integrity.
8. Why This Architecture Matters for Organisations 🚀
Fusion Data Intelligence goes far beyond traditional BI. It sits at the heart of Oracle’s broader Data Intelligence Platform, delivering a unified, 360° view across all enterprise data - transactional, analytical, structured, and unstructured.

🌟 A Unified Data-Intelligence Ecosystem

Unlike legacy stacks - OBIA, ODI, siloed data centres - FDI is built on Oracle’s next-generation Data Intelligence Platform. It blends data lakes, Autonomous Data Warehouse, Oracle Analytics Cloud, OCI AI services, and GoldenGate streaming into a seamless, managed ecosystem. This means organisations can now handle batch and real-time data, include external sources and apply AI/ML - all within one secure environment.

Note that this is Oracle's vision: the Data Intelligence Platform has been announced but is not yet generally available.

🔄 Consistent Insights Across Pillars

FDI’s architecture supports conformed dimensions and shared semantic models spanning finance, HR, SCM, and CX. This allows for unified KPIs and analytics, enabling stakeholders to ask and answer cross-domain questions like:

  • “How is workforce turnover impacting revenue?”
  • “What supply chain inefficiencies are driving higher costs?”

The result is enterprise-wide analytics based on a single source of truth.

💡 Full Extensibility with Governed Access

As part of Oracle’s Data Intelligence Platform, FDI offers extensive extensibility. Users can bring in external datasets, extend semantic models, build custom analytics, and consume OCI AI services - all within Oracle’s security framework. Governed self-service means broad analytical freedom without compromising data integrity.

🛠 Evergreen Platform, Zero Infrastructure Burden

The platform is fully managed and evergreen. Oracle handles everything - from provisioning, patching, tuning, and upgrades to integrating the latest AI services. Teams can focus on driving value rather than wrestling with infrastructure.
🎯 Summary: Strategic Differentiators

  • 360° data integration: a unified platform for Fusion, external, structured and unstructured data.
  • Cross-domain analytics: shared metrics enable enterprise-wide decision support.
  • AI and embedded actions: from insights to intervention with intelligent apps.
  • Extensible and secure: custom analytics and models within a governed environment.
  • Zero-touch management: SaaS simplicity with no technical debt or manual upgrades.
As you’ve seen, Fusion Data Intelligence delivers a fully managed, cloud-native analytics ecosystem - bringing together Fusion SaaS, Oracle’s Autonomous Data Warehouse, and Analytics Cloud under one secure, AI-enhanced platform. It unifies data across domains, embeds intelligent insights and governance, and eliminates legacy complexity - truly delivering on Oracle’s vision of a Data Intelligence Platform. Now it’s your turn: take a moment to reflect on how FDI could accelerate insight‑driven transformation in your organisation. 

From OBIA to Fusion Data Intelligence: Enterprise Application Analytics Reinvention

24/6/2025


 
Over the years, many of us working in the Oracle analytics space have helped customers implement Oracle Business Intelligence Applications (OBIA) - a powerful solution in its time, offering prebuilt analytics across ERP, HCM and more. But let’s be honest: it had its fair share of complexity, rigidity, and technical debt. If you ever spent hours managing DAC, tweaking ETL mappings, or retrofitting OBIA customisations after a patch, you’ll understand why Fusion Data Intelligence feels like Oracle finally got analytics right.
​
Fast-forward to today and we’ve entered a new era with Oracle Fusion Data Intelligence (FDI) - a reimagined, cloud-native analytics platform designed from the ground up for the Fusion SaaS landscape. And if you’ve ever battled with OBIA’s extensibility, upgrade cycles or data latency, FDI is likely to feel like a breath of fresh air.

This post is the first in a short series unpacking what FDI actually is, how it compares with its predecessors, and what it means for Fusion customers today.

Oracle's recent growth

Over the past 2–3 years, Oracle has consistently grown its cloud business, with total revenue rising from $40.5 billion in FY2022 to $57.4 billion in FY2025, driven largely by strong momentum in Fusion Cloud Applications, NetSuite, and OCI (Oracle Cloud Infrastructure).

While Oracle doesn’t match the scale of hyperscalers like AWS or Microsoft Azure in infrastructure alone, its distinct advantage lies in its full-stack strategy - uniquely offering enterprise SaaS, infrastructure, and the database layer under one roof. This vertically integrated model means Oracle can optimise performance, security, and cost across its stack, especially for Fusion workloads. Competitors like SAP and Workday lead in applications but lack native cloud infrastructure; AWS and Azure dominate infrastructure but rely on third-party SaaS partners.

​Oracle, by contrast, continues to blur the lines between application and platform, using technologies like Autonomous Database, OCI Gen2, and now Fusion Data Intelligence to deliver insights that are deeply embedded, secure, and performant - all within its own ecosystem.
These figures aren’t just impressive - they’re a strong signal that Oracle’s SaaS portfolio is achieving scale and maturity, particularly in core enterprise functions like Finance, HR, and Operations. Fusion ERP alone has grown from $0.9B to $1.0B in quarterly revenue, underscoring widespread enterprise adoption.

From Adoption to Insight: The Next Frontier

​As organisations continue investing in Oracle Fusion Cloud applications, the expectation isn’t just automation - it’s intelligence. Businesses aren’t content with simply moving transactional processes to the cloud; they want to understand the return on those investments, monitor performance in real time, and use their data to make faster, smarter decisions.
This is where Fusion Data Intelligence (FDI) steps in.
Just as Oracle’s adoption of Fusion SaaS pillars is accelerating, so too is the demand for embedded, governed, cross-functional insights that empower users in the flow of work. With SaaS platforms becoming the new systems of record, the analytics layer must evolve in lockstep - and be natively integrated, secure, and scalable.
FDI is that evolution.

Why FDI Matters Now More Than Ever
  • Data gravity has shifted - core operational data now lives in Fusion Cloud ERP, HCM, SCM, and CX.
  • Decision-makers want answers in context - not in a separate BI tool, but embedded inside the applications they use every day.
  • Cloud-native analytics isn’t a nice-to-have; it’s a business requirement for agility, accountability, and optimisation.

​FDI bridges this critical gap by turning raw operational data into actionable intelligence - all while aligning with the Fusion application security model, lifecycle, and extensibility standards.
SaaS Pillar | Q4 FY2025 Revenue | YoY Growth
Fusion Cloud ERP | $1.0B | +22%
NetSuite Cloud ERP | $1.0B | +18%
Total Cloud Applications (SaaS) | $3.7B | +12%
Looking Back: OBIA Was Revolutionary — But the World Has Moved On

​When it launched, Oracle Business Intelligence Applications (OBIA) was genuinely ahead of its time. Prebuilt subject areas, KPI dashboards, and ETL pipelines for ERP, HCM, SCM, and CRM systems allowed organisations to fast-track enterprise reporting without starting from scratch. OBIA gave business users actionable insights over operational systems, and it helped many enterprises move beyond siloed spreadsheets into a more governed BI model.
But OBIA came with constraints that, over time, became significant limitations:

  • Customisations were fragile: Even minor changes to the ETL or data model could break during upgrades, making custom development expensive and hard to maintain.
  • Siloed architectures: Each functional pillar (e.g. Oracle EBS, PeopleSoft, Siebel, JDE) had its own data model and ETL codebase, with no unified customer or product view.
  • Upgrade pain: OBIA upgrades were major projects in themselves - often requiring reimplementation or heavy retrofitting for customised solutions.
  • Customer-managed infrastructure: Customers had to provision, maintain, and tune on-premises environments — including DAC, Informatica, BI Server, and database platforms - with all the associated complexity and cost.
  • Slow to evolve: The on-prem model struggled to keep up with rapidly changing business requirements and the rise of cloud-based enterprise applications.

The Modern Alternative: Fusion Data Intelligence

With Fusion Data Intelligence (FDI), Oracle has reimagined what enterprise application analytics should look like in the cloud era.
  • Extensibility is core, not an afterthought: FDI is metadata-driven and designed to be extended - securely, scalably, and without breaking on every update.
  • Unified cross-pillar model: Instead of siloed data marts, FDI provides a 360-degree customer, employee, and transaction view across Fusion ERP, HCM, SCM, and CX — built on a shared semantic model.
  • Evergreen analytics: As Fusion Apps evolve, so too does FDI — delivered as a SaaS analytics layer that evolves in sync with Oracle’s application roadmap.
  • Oracle-managed infrastructure: FDI is provisioned, patched, secured, and optimised by Oracle - no DAC, no Informatica, no infrastructure headaches.
  • AI and machine learning embedded: Insights aren’t just historical - FDI includes predictive models, anomaly detection, and natural language queries right out of the box.
From OBIA to OAX to FAW to FDI: An Analytics Evolution

FDI didn’t appear out of nowhere - it’s the result of five years of iterative development across multiple product identities. It began as Oracle Analytics for Applications (OAX), introduced around 2019 as a cloud-based successor to OBIA. OAX was designed to deliver prebuilt analytics for Oracle Fusion Cloud Applications, leveraging Oracle Autonomous Data Warehouse and Oracle Analytics Cloud. In 2020, OAX was rebranded as Fusion Analytics Warehouse (FAW), marking a shift toward a more unified, extensible platform. FAW introduced modular “pillars” aligned with business domains - ERP, HCM, SCM, and CX - each offering curated data models, semantic layers, and prebuilt KPIs. Over the next few years, Oracle expanded these pillars with hundreds of subject areas and embedded machine learning for predictive insights.

In 2024, FAW was renamed Fusion Data Intelligence (FDI). This rebranding emphasised its broader mission: not just warehousing analytics, but enabling intelligent decision-making across the enterprise. FDI retained the core architecture - Autonomous Data Warehouse, Oracle Analytics Cloud, and managed pipelines - but added enhanced extensibility, data sharing capabilities, and a more intuitive console for governance and customisation.
In short, where OBIA was revolutionary for the on-prem era, FDI is purpose-built for the cloud-native enterprise. It meets today’s expectations for agility, integration, governance, and intelligence - without the baggage of yesterday’s architecture.
Looking Ahead

​This post was just the beginning. Over the next few instalments, we’ll dive deeper into the nuts and bolts of Fusion Data Intelligence - from how it handles extensibility and embedded insights, to what it means for Fusion customers trying to move beyond dashboards and into decision intelligence.

FDI represents more than just a new analytics tool - it’s a shift in how Oracle customers can extract value from their SaaS investments. If you’ve ever found yourself battling data silos, struggling with upgrades, or explaining to stakeholders why reporting still takes days, this series is for you. Stay tuned.

Insights from Unstructured Data: Oracle Analytics and AI Document Understanding

8/6/2025


 
When we think about business data, we usually picture tidy tables and dashboards neatly populated with structured relational data. But in reality, much of an organisation’s most valuable information lives in unstructured formats—scanned invoices, PDFs, handwritten notes, and contracts. This data is often locked away in silos, disconnected from the wider analytical ecosystem.

Oracle Analytics’ AI Document Understanding feature changes that. It enables organisations to automatically extract structured data from documents stored in OCI Object Storage using pretrained AI models—all without needing a data science team. With this capability, you can enrich dashboards with data that would previously be too costly or complex to access.

In this post, we’ll walk through:

  • What the AI Document Understanding feature is
  • Practical business scenarios where it is applicable
  • Step-by-step setup and configuration, including OCI policy management and model registration
  • Tips and tricks for working with unstructured data in Oracle Analytics
What Is Oracle Analytics AI Document Understanding?

At its core, the AI Document Understanding capability in Oracle Analytics leverages AI models (deployed within Oracle Cloud Infrastructure) to parse and extract fields of interest from documents stored in OCI Object Storage. This is particularly powerful for automating workflows that currently depend on manual data entry or semi-structured file formats.

It supports a range of document types and layouts, including:

  • PDF invoices from suppliers
  • Utility bills or receipts
  • Financial statements
  • Handwritten or printed forms (subject to OCR accuracy)
The output is a structured dataset—columnar data ready for blending, analysis, and visualisation in Oracle Analytics.
IAM Policies
​
To enable Oracle Analytics to securely access documents stored in OCI Object Storage and to invoke AI services like Document Understanding, specific IAM policies must be in place. Without these policies, your OAC instance won’t have the necessary permissions to read documents or trigger AI model processing. In this section, we’ll walk through the exact tenancy- and compartment-level policies required, ensuring your setup is both functional and secure. You can find more information here.
The following IAM policies grant Oracle Analytics the necessary permissions to read from your Object Storage bucket and to invoke the AI Document Understanding service.
Compartment level IAM Policy

Notes

  1. Replace <GROUP NAME> with your OCI IAM Group name
  2. Replace <COMPARTMENT NAME> with your OCI Compartment name where you Object Storage bucket is located in your OCI tenancy
  3. Replace <BUCKET NAME> with your Object Storage Bucket name where your files have been uploaded to. Note that the Bucket Name is in single quotes.​
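Combining the notes above with standard OCI policy syntax, the compartment-level policy will look something like the following. This is an illustrative sketch - verify the exact verbs and resource types against the Oracle documentation linked above before using it:

```
allow group <GROUP NAME> to read buckets in compartment <COMPARTMENT NAME>
allow group <GROUP NAME> to read objects in compartment <COMPARTMENT NAME> where target.bucket.name='<BUCKET NAME>'
```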
​
The next policy needs to be defined at the root compartment level.
Root level IAM Policy
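At the tenancy root, the policy grants the group access to the Document Understanding AI service. Again, this is a sketch: ai-service-document-family is my assumption for the aggregate resource type, so confirm it against current Oracle documentation:

```
allow group <GROUP NAME> to use ai-service-document-family in tenancy
```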

These policies are necessary to enable Oracle Analytics to access the OCI AI Document Understanding model. Without these policies correctly set up, you will encounter errors when you attempt to run your data flow in Oracle Analytics.
With the IAM policies configured, you can now proceed with setting up the connection and registering the model within Oracle Analytics.

You do this by creating an Oracle Analytics connection to your Oracle Cloud Infrastructure tenancy, which gives you access to your OCI Object Storage bucket.
Then register a pre-trained Document Key Value Extraction model with your Oracle Analytics instance, ensuring that the bucket created previously is selected.
This completes all prerequisites and the next step is to run the newly registered pre-trained model in Oracle Analytics by creating a data flow.

The next step is to create a dataset, which is used as an input to the data flow. This dataset is a CSV file containing the OCI Object Storage bucket URL where the documents have been uploaded.

This CSV file can either contain a row for each document you intend to process, each with its own URL, or a single row with the URL of the bucket itself, in which case every document within the bucket will be processed. Personally, the second option is a no-brainer for me. Note that you derive the bucket URL by logging on to the OCI console's bucket details page and copying the URL from your browser. The sample below has two tabs: the first is what you would use for option 1, listing documents with their corresponding URLs; the second has a single row and instructs the data flow to process all documents within the specified bucket.
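As a rough illustration of option 2, the input dataset is just a one-row CSV whose single value is the bucket URL. In this sketch the header name "URL" and the URL itself are placeholders - match your downloaded template, and copy the real URL from the bucket details page in the OCI console:

```python
# Sketch of generating the option-2 input dataset as a one-row CSV.
import csv
import io

# Placeholder URL - replace with the one copied from your browser.
bucket_url = "https://cloud.oracle.com/object-storage/buckets/<namespace>/<bucket>"

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["URL"])       # hypothetical header name - use your template's
writer.writerow([bucket_url])  # single row: process every document in the bucket

print(buffer.getvalue())
```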
Template Bucket URL dataset
File Size: 10 kb
File Type: xlsx
Download File

Picture
Picture
Follow the instructions here to create your data flow.
Picture
Using the Apply AI Model step, you make a call to the registered pretrained AI Document Understanding model. You then add a Save Data step in which you specify the output dataset. In my example below, I have a few Transform Column steps which apply transformations to some of the columns.
Picture
Once the data flow has been saved, it can be run to generate the output dataset. You can see a sample data visualisation workbook below based on the output dataset with some insights of the information derived from the invoices.
Picture
Tips and tricks for working with unstructured data in Oracle Analytics

Working with unstructured documents—especially at scale—introduces its own set of quirks. Here are some practical insights to help you get the most out of the AI Document Understanding feature in Oracle Analytics:

Use Document Batching Strategically
Oracle Analytics currently imposes a 10,000-row processing limit per run. If you’re working with high volumes:

  • Break documents into batches and use multiple data flows.
  • Consider filtering by date or document type.​
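A generic way to respect that limit is to split the document list into fixed-size chunks and feed each chunk to its own data-flow run. This is a sketch in plain Python, not an Oracle API:

```python
# Generic batching sketch: split a list of document URLs into chunks
# no larger than the 10,000-row per-run limit mentioned above.

MAX_ROWS_PER_RUN = 10_000

def batch(documents, size=MAX_ROWS_PER_RUN):
    """Yield successive fixed-size batches from the document list."""
    for start in range(0, len(documents), size):
        yield documents[start:start + size]

docs = [f"doc_{i}.pdf" for i in range(25_000)]  # invented example volume
batches = list(batch(docs))
print(len(batches), [len(b) for b in batches])
# 3 batches of 10,000 + 10,000 + 5,000 documents
```

Each batch would then become the input dataset for a separate scheduled data flow.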
 
​Reuse and Schedule Data Flows
Once you’ve built a data flow that works, save it and schedule it to run regularly:

  • Use the built-in scheduler in Oracle Analytics.

​Start Small, Then Scale
Try a proof-of-concept with 10–20 documents first:

  • Test how accurate the model is.
  • Check how the data appears in the dataset.
  • Adjust your approach based on results—especially if OCR noise is high.
Resources

​Official Documentation
Some YouTube videos from the Oracle Analytics channel:
Gotchas, Limits and Tips

1. Bucket URL Must Be Copied from Browser
The most confusing part of this setup is finding the correct OCI Object Storage bucket URL. It’s not visible anywhere in the console UI—you must copy it from the bucket’s detail page URL in your browser.
2. 10,000 Document Row Limit
There’s a hard limit of 10,000 document rows per data flow run. If your use case involves large volumes of documents, you’ll need to split your data or automate batch runs accordingly. Note that the limit is lower when a custom model is used: 2,000 documents in that scenario.
3. Document Layouts Matter
The AI model is pre-trained for certain layouts (e.g. invoices, forms). Custom layouts may yield mixed results, and you may need to experiment with field mappings to improve outcomes.
4. Use Tags for Traceability
Tag your buckets and policies in OCI with labels like oac-ai-docs so they’re easier to audit and maintain.

Conclusion

Oracle Analytics’ AI Document Understanding feature bridges a crucial gap between unstructured documents and visual analytics. With a few setup steps—bucket creation, IAM policy configuration, model registration, and a simple data flow—you can surface hidden insights from documents that would otherwise sit untouched.

It’s a powerful tool, but one with nuances—such as the hidden bucket URL and processing limits—that are worth planning for. Still, for anyone looking to extend their analytics to the edges of their data estate, this capability opens the door. Oracle Analytics now makes it possible to integrate scanned documents, invoices, and other unstructured data sources directly into your dashboards—unlocking insights that were previously out of reach.​

Optimising Performance in Oracle Analytics Cloud: A Deep Dive into Extracted Data Access Mode

10/5/2025


 
The May 2025 update to Oracle Analytics Cloud (OAC) introduces a significant new feature designed to boost performance and reduce dependency on source systems: the Extracted data access mode. This new capability is especially valuable for enterprise users seeking to optimise dashboard responsiveness, reduce backend load, and deliver consistent performance across a variety of usage scenarios. In this expanded post, we’ll delve into what Extracted mode brings to the table, compare it with the existing Live and Cached modes, and offer guidance on how to get the most value from it.
Understanding Data Access Modes in Oracle Analytics Cloud
To fully appreciate the advantages of the new Extracted mode, it helps to revisit the existing data access modes in Oracle Analytics Cloud — namely Live and Cached. Each mode supports different use cases, with varying implications for data freshness, system performance, and architectural complexity.
Live Mode
In Live mode, Oracle Analytics executes every query directly against the source system in real time. Whether a user is exploring a dashboard, applying filters, or drilling into data, each action sends a query to the backend database.
Advantages:
  • Delivers the most current, up-to-date data.
  • No need to manage refresh schedules or data synchronisation.
  • Well suited for operational reporting or scenarios requiring real-time insight.
Limitations:
  • Performance is dependent on the source system's speed, load, and query optimisation.
  • High concurrency or complex dashboards can introduce latency.
  • Potential to introduce heavy load on transactional systems.
Cached Mode
Cached mode creates a temporary local copy of query results within OAC’s cache layer. This cache is generated on the fly when users first load a dashboard or run a query, and is reused in subsequent interactions where applicable.
Advantages:
  • Provides improved performance over Live mode by reducing repeated source queries.
  • Helps to offload traffic from backend systems.
  • Ideal for static or slow-changing datasets.
Limitations:
  • Cache is unpredictable — built based on query patterns, not pre-defined schedules.
  • May return stale data if the cache isn’t invalidated or refreshed.
  • Limited reusability across users or sessions — each user's interactions influence their cache.
Introducing: Extracted Mode (New in May 2025)
The newly introduced Extracted mode provides a more structured and predictable alternative. It allows dataset creators to perform a full extract of data from a source system and store that extract directly within Oracle Analytics. Unlike Cached mode, this data snapshot is proactively managed and completely reusable.
Key Benefits of Extracted Mode:
  • Delivers the highest performance, since all queries are resolved within OAC’s internal storage engine.
  • Removes dependency on the source system’s availability or performance.
  • Suitable for use in mission-critical dashboards, sandbox experimentation, and shared analytical content.
Comparison Table: Live vs Cached vs Extracted Mode
Cached vs Extracted Mode (Quick Reference):
Considerations:
  • Extracted mode is not intended for near real-time analytics — data is as current as the last scheduled refresh.
  • Storage consumption needs to be managed, especially in environments with many large datasets.
  • Careful governance is required to ensure extract schedules align with business requirements.
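Because extracted data is only as current as its last refresh, a lightweight staleness check can be wired into monitoring. A minimal sketch (the function, threshold, and timestamps are illustrative, not an OAC API):

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_refresh: datetime, max_age: timedelta) -> bool:
    """Flag an extract whose last refresh is older than the agreed SLA."""
    return datetime.now(timezone.utc) - last_refresh > max_age

# Example: a daily extract last refreshed 30 hours ago breaches a 24-hour SLA.
last = datetime.now(timezone.utc) - timedelta(hours=30)
print(is_stale(last, timedelta(hours=24)))
```

The refresh timestamp itself would come from wherever you log dataset refreshes.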
Creating and Managing Extracted Datasets in OAC
Working with Extracted mode is a straightforward process within Oracle Analytics Cloud’s interface. Here’s a step-by-step guide:
  1. Start with a Dataset: Navigate to the Data page and create a dataset using your source connection (such as Oracle DB, ADW, or Fusion Apps).
  2. Select Extracted Mode: During the dataset setup, choose Extracted as the data access mode. You can also switch an existing dataset to this mode by editing its properties.
  3. Configure the Refresh Policy: Set a refresh schedule that reflects your data update needs — daily, weekly, or at custom intervals. Manual refresh is also available.
  4. Monitor and Maintain: The UI shows the last refresh time and status. You can also manually refresh the dataset or update the schedule as required.
  5. Use Across Projects: Once extracted, the dataset is immediately available for use in DV projects, data flows, and dashboards without re-querying the source.

Additional Tips:
  • Use filtering at the dataset creation stage to limit extract size.
  • Document the dataset’s intended use and refresh strategy.
  • Use naming conventions to indicate refresh frequency (e.g. Sales_Extract_Daily).
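A naming convention like the one above can also be validated programmatically. A sketch assuming a hypothetical &lt;Subject&gt;_Extract_&lt;Frequency&gt; pattern (the regex and allowed suffixes are my own choices):

```python
import re

# Suffixes allowed by the illustrative <Subject>_Extract_<Frequency> convention.
PATTERN = re.compile(r"^[A-Za-z]+_Extract_(Hourly|Daily|Weekly|Monthly)$")

def refresh_frequency(dataset_name: str):
    """Return the refresh frequency encoded in a dataset name, or None."""
    m = PATTERN.match(dataset_name)
    return m.group(1) if m else None

print(refresh_frequency("Sales_Extract_Daily"))  # matches the convention
print(refresh_frequency("AdHocDataset"))         # does not match
```

A check like this could run as part of a periodic catalogue audit.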

Where Extracted Mode Shines: Key Use Cases
The benefits of Extracted mode become most apparent in high-demand or constrained environments. Here are several real-world examples where this mode adds tangible value:
  • Executive and Board-Level Dashboards: These consumers demand instant insights. Extracted mode ensures consistent load times without reliance on backend performance.
  • Training and Demo Environments: Great for isolated setups where live connections to backend systems are not possible or reliable.
  • High-Concurrency Reporting: Shared content accessed by many users can overwhelm live systems — extracting the data removes that risk.
  • Agile Development and Prototyping: Teams can iterate quickly without introducing noise into production systems or waiting for slow queries.
  • Hybrid Scenarios: Combine Extracted mode for stable reference data with Live mode for transactional data to strike the right balance.

Best Practices for Extracted Mode
To ensure you get the best results from Extracted mode, consider these best practices:
  • Right-Size Your Extracts: Avoid pulling unnecessary detail — summarised data is more performant and easier to maintain.
  • Monitor Storage and Growth: Keep an eye on storage usage and growth trends, especially in environments with many datasets.
  • Align Schedules with Business Needs: Overly frequent refreshes can add unnecessary load, while infrequent ones may risk data staleness.
  • Establish Ownership: Assign responsibility for refresh schedules and storage oversight.
  • Test Before Deployment: Validate dataset size, refresh time, and dashboard performance before promoting into production.
Final Thoughts
The introduction of Extracted mode in Oracle Analytics Cloud marks a significant step forward in how practitioners can balance data freshness, performance, and scalability. By providing a fully materialised, high-speed dataset layer within OAC, this new mode empowers teams to deliver faster, more consistent user experiences without overloading backend systems.
It’s not a silver bullet — and it won’t replace Live mode where real-time data is needed — but for many scenarios, particularly those requiring speed and stability, Extracted mode is a smart and strategic choice.
With Oracle continuing to invest in features that improve accessibility, manageability, and user experience, this latest enhancement underlines the platform’s commitment to evolving enterprise analytics.

Optimising Data Strategy for AI and Analytics in Oracle ADW: Reducing Storage Costs with Silk Echo

10/3/2025


 
The Growing Challenge of Data Duplication in AI and Analytics

As enterprises increasingly adopt AI-driven analytics, the demand for efficient data access continues to rise. Oracle Autonomous Data Warehouse (ADW) is a powerful platform for analytical workloads, but AI-enhanced processes—such as Agentic AI, Retrieval-Augmented Generation (RAG), and predictive modelling—place new strains on data management strategies.

A key issue in supporting these AI workloads is the need for multiple data copies, which drive up storage costs and operational complexity. Traditional approaches to data replication no longer align with the scale and agility required for modern AI applications, forcing organisations to rethink how they manage, store, and access critical business data.

This blog builds upon my previous post on AI Agents in the Oracle Analytics Ecosystem, further exploring how AI-driven workloads impact traditional data strategies and how organisations can modernise their approach.
Why AI Workloads Demand More Data

AI models, particularly those leveraging RAG, generative AI, and deep learning, require constant access to vast amounts of data. In Oracle ADW environments, these workloads often involve:
  • Agentic AI and RAG: Continually retrieving and processing real-time or near-real-time data for enhanced decision-making, requiring multiple indexed views of the same dataset.
  • Predictive Analytics: Running machine learning models that require extensive historical data for training and inference, often necessitating multiple snapshots of production data.
  • Natural Language Processing (NLP): Extracting insights from unstructured data, requiring large-scale indexing, vector search capabilities, and duplication of processed text corpora.
  • AI-Driven Data Enrichment: Merging structured and unstructured data sources to generate deeper insights, often leading to multiple temporary and persistent data copies.
  • AI Model Testing & Validation: Deploying and fine-tuning AI models across different datasets requires isolated environments, each consuming additional storage resources.
IDC has extensively documented the exponential growth of data and AI investments. Recent industry reports indicate that data storage requirements for AI workloads are expanding at an unprecedented rate.
​
IDC’s broader research reveals several critical insights about AI’s accelerating impact on data ecosystems:
​
  1. Global Datasphere Growth: IDC forecasts the global datasphere will reach 394 zettabytes by 2028 (up from 149ZB in 2024), representing a 19.4% compound annual growth rate [11][19]. While this encompasses all data, AI workloads are a primary driver:
      • 90ZB of data will be generated by IoT devices by 2025, much of it processed by AI systems [2][19].
      • Real-time data (crucial for AI) will grow to 30% of all data by 2025, up from 15% in 2017 [6].
  2. AI-Specific Infrastructure Demands:
      • Spending on AI-supporting technologies will reach $337B in 2025, doubling from 2024 levels, with enterprises allocating 67% of this budget to embedding AI into core operations [3][8].
      • AI servers and related infrastructure are growing at 29-35% CAGR, outpacing general IT spending [15][17].
  3. Generative AI Acceleration: IDC predicts Gen AI adoption will drive $1T in productivity gains by 2026, with 35% of enterprises using Gen AI for product development by 2025 [4][18]. This requires massive data processing:
      • Cloud platform services supporting AI workloads are growing at >50% CAGR [5].
      • AI-optimised PCs will comprise 60% of all shipments by 2027, enabling localised data processing [20].
      • Enterprise AI spending will double from $120B (2022) to $227B (2025) in the US alone [1][3].
      • Gen AI spending is projected to reach $202B by 2028, representing 32% of total AI investments [8].

The data explosion is being fuelled by AI use cases like augmented customer service (+30% CAGR), fraud detection systems (+35.8% CAGR), and IoT analytics [1][8]. IDC emphasises that 90% of new enterprise apps will embed AI by 2026, ensuring continued exponential data growth at the intersection of AI adoption and digital transformation [9][12].
AI data volumes are projected to increase significantly, posing challenges for enterprises striving to maintain scalable and cost-efficient storage solutions. Without proactive measures, organisations risk soaring expenses and performance limitations that could stifle innovation.

Sources
[1] Spending on AI Solutions Will Double in the US by 2025, Says IDC https://www.bigdatawire.com/this-just-in/spending-on-ai-solutions-will-double-in-the-us-by-2025-says-idc/
[2] IDC: Expect 175 zettabytes of data worldwide by 2025 - Network World https://www.networkworld.com/article/966746/idc-expect-175-zettabytes-of-data-worldwide-by-2025.html
[3] IDC Unveils 2025 FutureScapes: Worldwide IT Industry Predictions https://www.idc.com/getdoc.jsp?containerId=prUS52691924
[4] IDC Predicts Gen AI-Powered Skills Development Will Drive $1 Trillion in Productivity Gains by 2026 https://www.idc.com/getdoc.jsp?containerId=prMETA51503023
[5] AI consumption to drive enterprise cloud spending spree - CIO Dive https://www.ciodive.com/news/cloud-spend-doubles-generative-ai-platform-services/722830/
[6] Data Age 2025: - Seagate Technology https://www.seagate.com/files/www-content/our-story/trends/files/Seagate-WP-DataAge2025-March-2017.pdf
[7] IDC Predicts Gen AI-Powered Skills Development Will Drive $1 Trillion in Productivity Gains by 2026 https://www.channel-impact.com/idc-predicts-genai-powered-skills-development-will-drive-1-trillion-in-productivity-gains-by-2026/
[8] Worldwide Spending on Artificial Intelligence Forecast to Reach $632 Billion in 2028, According to a New IDC Spending Guide https://www.idc.com/getdoc.jsp?containerId=prUS52530724
[9] Time to Make the AI Pivot: Experimenting Forever Isn’t an Option https://blogs.idc.com/2024/08/23/time-to-make-the-ai-pivot-experimenting-forever-isnt-an-option/
[10] How real-world businesses are transforming with AI - with 50 new stories https://blogs.microsoft.com/blog/2025/02/05/https-blogs-microsoft-com-blog-2024-11-12-how-real-world-businesses-are-transforming-with-ai/
[11] Data growth worldwide 2010-2028 - Statista https://www.statista.com/statistics/871513/worldwide-data-created/
[12] IDC and IBM lists best practices for scaling AI as investments set to double https://www.ibm.com/blog/idc-and-ibm-list-best-practices-for-scaling-ai-as-investments-set-to-double/
[13] Nearly All Big Data Ignored, IDC Says - InformationWeek https://www.informationweek.com/machine-learning-ai/nearly-all-big-data-ignored-idc-says


The Traditional Approach: Cloning Production Data

Historically, organisations have relied on full database cloning to create isolated environments for AI training, model validation, and analytics. While this approach ensures data consistency, it comes with significant drawbacks:
  • Storage Overhead: Each cloned copy requires additional storage, leading to exponential growth in consumption and costs. For organisations processing terabytes or petabytes of data, this rapidly becomes unsustainable.
  • Data Staleness: Cloned datasets quickly become outdated, requiring frequent refreshes that consume computing resources and delay AI-driven insights.
  • Operational Complexity: Managing multiple cloned copies increases administrative overhead, creating challenges in data governance, version control, and compliance.
  • Performance Bottlenecks: As AI models interact with production or cloned datasets, increasing query loads can degrade performance, slowing down analytics and decision-making.
  • Security & Compliance Risks: More data copies mean more potential points of exposure, increasing the risk of non-compliance with regulations such as GDPR, CCPA, and industry-specific mandates.
Cost Implications of Traditional Data Cloning

To put this into perspective, consider a mid-sized enterprise running an Oracle Autonomous Data Warehouse (ADW) instance with 50TB of data. If multiple teams require their own clones for model training and testing, the storage footprint could easily reach 250TB or more. With cloud storage costs averaging £0.02 per GB per month, this could result in annual expenses exceeding £60,000—just for storage alone. Factor in compute, additional database costs and administrative overhead, and the financial impact becomes even more pronounced.
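The arithmetic behind that estimate can be checked directly. Assuming five full copies of the 50TB warehouse (production plus four team clones, an illustrative split) and decimal units (1 TB = 1,000 GB):

```python
copies = 5                   # production plus four team clones (assumed split)
tb_per_copy = 50             # 50 TB source dataset
price_per_gb_month = 0.02    # GBP per GB per month, as quoted above

total_gb = copies * tb_per_copy * 1_000        # 250 TB in GB
annual_cost = total_gb * price_per_gb_month * 12
print(f"{total_gb:,} GB -> GBP {annual_cost:,.0f} per year")
```

That reproduces the 250TB footprint and roughly £60,000-per-year storage figure quoted above, before compute and administrative overhead.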

The challenge becomes particularly acute when considering the unique characteristics of AI workloads. Traditional RDBMS architectures were designed for transactional processing and structured analytical queries, but AI workflows introduce several distinct pressures:

Data Transformation Requirements: Machine learning models often require multiple transformations of the same dataset for feature engineering, resulting in numerous intermediate tables and views. These transformations must be stored and versioned, further multiplying storage requirements.

Concurrent Access Patterns: AI training workflows typically involve intensive parallel read operations across large datasets, which can overwhelm traditional buffer pools and I/O subsystems designed for mixed read/write workloads. This often leads to performance degradation for other database users.

Version Control and Reproducibility: ML teams need to maintain multiple versions of datasets for experiment tracking and model reproducibility. Traditional RDBMS systems lack native support for dataset versioning, forcing teams to create full copies or implement complex versioning schemes at the application level.

Query Complexity: AI feature engineering often involves complex transformations that push the boundaries of SQL optimisation. Operations like window functions, recursive CTEs, and large-scale joins can strain query optimisers designed for traditional business intelligence workloads.
​
Resource Isolation: When multiple data science teams share the same RDBMS instance, their resource-intensive operations can interfere with each other and with production workloads. Traditional resource governors and workload management tools may not effectively handle the bursty nature of AI workloads.
Additionally, the need for data freshness adds another layer of complexity. Teams often require recent production data for model training, leading to regular refresh cycles of these large datasets. This creates significant network traffic and puts additional strain on production systems during clone or backup operations.

To address these challenges, organisations are increasingly exploring alternatives such as:
​
  1. Data virtualisation and zero-copy cloning technologies
  2. Purpose-built ML feature stores with versioning capabilities
  3. Hybrid architectures that offload AI workloads to specialised platforms
  4. Automated data lifecycle management to control storage costs
  5. Implementation of data fabric architectures that provide unified access whilst maintaining physical separation of AI and operational workloads
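The zero-copy idea in option 1 can be made concrete with a toy copy-on-write snapshot: the snapshot shares pages with its source and only materialises a page when it is written. This is a conceptual sketch, not how any particular product implements it:

```python
class CowSnapshot:
    """Toy copy-on-write snapshot: shares unmodified pages with the source."""

    def __init__(self, source: dict):
        self._source = source   # shared, read-only view of the original data
        self._delta = {}        # pages materialised only when written

    def read(self, page):
        # Serve modified pages from the delta, everything else from the source.
        return self._delta.get(page, self._source[page])

    def write(self, page, value):
        self._delta[page] = value  # only changed pages consume new storage

prod = {"orders": "v1", "customers": "v1"}
snap = CowSnapshot(prod)
snap.write("orders", "v2-experiment")
print(snap.read("orders"), snap.read("customers"), len(snap._delta))
```

The snapshot stores one changed page rather than a full copy, which is why zero-copy cloning scales so much better than full physical clones.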
The financial implications extend beyond direct storage costs. Organisations must consider:
​
  • Additional licensing costs for database features required to support AI workflows
  • Network egress charges for data movement between environments
  • Increased operational complexity and associated staffing costs
  • Potential performance impact on production systems
  • Compliance and security overhead for managing sensitive data across multiple environments

As AI workloads continue to grow, organisations need to carefully evaluate their data architecture strategy to ensure it can scale sustainably whilst maintaining performance and cost efficiency.

To overcome these challenges, organisations need a solution that optimises storage usage while maintaining seamless access to real-time data. Silk Echo is a powerful tool for optimising database replication in cloud environments. It offers a range of features that improve performance, simplify management, and enhance the resiliency of data infrastructure.

Silk Echo enables virtualised, lightweight data replication. Instead of creating full physical copies of datasets, it provides near-instantaneous, space-efficient snapshots that eliminate unnecessary duplication.

Introducing Silk Echo: A Smarter Approach to AI Data Management

Silk Echo addresses the challenge of data duplication by providing a high-performance virtualised storage layer. Instead of physically copying data into multiple environments, Silk Echo allows AI workloads, data warehouses, and vector databases to operate on a single logical copy. This reduces unnecessary duplication while maintaining high-speed access to data.

How Silk Echo Works

Virtualised Data Access – Silk Echo enables AI workloads to access data stored in Oracle ADW and other environments without requiring full duplication.

High-Performance Caching – Frequently accessed AI data is cached efficiently to provide rapid query performance.

Seamless Integration – Silk Echo integrates with Oracle ADW, vector databases, and AI model pipelines, reducing the need for repeated ETL processes.

Cost Optimisation – By eliminating redundant data copies, organisations can significantly cut down on storage costs while maintaining AI performance.
​
Silk Echo represents a shift in how enterprises approach AI and data management, ensuring that AI workloads remain cost-efficient, scalable, and manageable within Oracle ADW environments. The next step is to explore how Silk Echo integrates with specific Oracle AI use cases.
Key Benefits of Silk Echo for Oracle ADW and AI Workloads

Products like Silk’s Echo offering provide a number of benefits to the RDBMS architecture, enabling efficient, cost-effective support of modern AI workloads. Some of these benefits are:
​
  • Storage Optimisation: Eliminates redundant data copies, reducing storage consumption by up to 80% and significantly lowering costs.
  • Real-Time Data Access: Ensures AI models always work with the most up-to-date information, reducing the lag introduced by traditional cloning processes.
  • Accelerated AI & Analytics Workflows: Removes bottlenecks associated with traditional cloning, improving overall data pipeline efficiency.
  • Enhanced Data Governance & Security: Reduces data sprawl, helping organisations maintain compliance and security standards with minimal administrative burden.
  • Faster AI Model Development & Deployment: Enables AI teams to test and validate models with up-to-date snapshots instead of relying on costly, static cloned environments.
Future-Proofing Oracle ADW and Oracle Analytics for AI Workloads

The rapid evolution of AI and analytics demands that organisations build future-proof architectures that can scale with new workloads. Silk Echo plays a crucial role in this by:
  • Enabling AI-Ready Data Architectures: With Silk Echo, Oracle ADW can handle the increasing demands of AI-driven analytics without compromising performance or cost efficiency.
  • Supporting AI Innovations: As AI models become more sophisticated, they will require dynamic and optimised access to real-time data. Silk Echo ensures that models always have the freshest data available.
  • Ensuring Long-Term Cost Efficiency: By minimising unnecessary data replication, Silk Echo provides a sustainable cost model that allows organisations to allocate resources more effectively to AI initiatives.
  • Enhancing Data Virtualisation Capabilities: The ability to create lightweight, instant extracts means organisations can easily integrate Oracle Analytics with broader AI ecosystems, improving analytical outcomes.

The Future of AI and Analytics in Oracle ADW

As AI adoption grows, businesses must rethink their data strategies to balance performance, cost, and scalability. By leveraging Silk Echo in Oracle ADW environments, organisations can:
  • Reduce the financial burden of storage-intensive AI processes.
  • Ensure AI-driven applications operate with real-time, accurate data.
  • Improve compliance and governance without slowing down innovation.
  • Scale AI and analytics workloads without excessive data duplication.

Are You Ready to Optimise Your AI-Driven Analytics in Oracle ADW?

By adopting next-generation storage solutions like Silk Echo, organisations can unlock the full potential of AI while keeping costs under control. Investing in efficient data management strategies today will ensure businesses remain competitive in the AI-driven future.

The Hidden Cost of Agentic AI: The Production Database Bottleneck

28/2/2025


 
The rise of Agentic AI is transforming the analytics landscape, but it comes with an often-overlooked challenge: database strain. Traditionally, operational databases are ringfenced to prevent unstructured, inefficient queries from affecting critical business functions. However, in a world where AI agents dynamically generate and execute SQL queries to retrieve real-time data, production databases are facing unprecedented pressure.
​
Additionally, Retrieval-Augmented Generation (RAG), a rapidly emerging AI technique that enhances responses with real-time data, is further intensifying this issue by demanding continuous access to up-to-date information. RAG works by supplementing AI-generated responses with live or external knowledge sources, requiring frequent, real-time queries to ensure accuracy. This puts even more strain on traditional database infrastructures. In a previous blog post, I looked at how Agentic AI will improve the experience for users of the Oracle Analytics ecosystem.
This blog explores the risks of this architectural shift, where AI agents are at odds with the traditional RDBMS architecture; why traditional solutions such as database cloning fall short; and how modern data architectures like data lakehouses and innovative storage solutions can help mitigate these challenges. Additionally, we examine the implications for the Oracle Analytics Platform, where these changes could impact both data accessibility and performance.
The Problem: AI Agents, RAG & Uncontrolled Query Load

A well-managed production database is typically shielded from unpredictable query loads. Database administrators ensure that only structured, optimised workloads access production systems to avoid performance degradation. But with Agentic AI and RAG, that fundamental principle is breaking down.
Instead of a few human analysts running queries, organisations may now have dozens or even hundreds of AI agents autonomously executing SQL queries in real time. These queries are often:

  • Ad hoc and unpredictable, making performance tuning difficult
  • Highly frequent, since AI agents are designed to work autonomously
  • Complex and computationally expensive, often scanning large datasets

This creates significant challenges for traditional RDBMS architectures, which were not designed to handle the scale and unpredictability of AI-driven workloads. With Retrieval-Augmented Generation (RAG) in particular, AI models require frequent access to real-time data to enhance their outputs, placing additional stress on transactional databases. Since these databases were optimised for structured queries and controlled access, the introduction of AI-driven workloads risks causing slowdowns, performance degradation, and even system failures.

For users of Oracle Analytics, this shift presents serious performance implications. If production databases are overwhelmed by AI-driven queries, query response times increase, dashboards lag, and real-time insights become unreliable. Additionally, Oracle Analytics’ AI Assistant, Contextual Insights, and Auto Insights features, which rely on efficient access to data sources, could suffer from delays or inaccuracies due to excessive load on transactional systems.

To mitigate this, organisations must rethink their database strategies, ensuring that AI workloads are governed, optimised, and properly distributed across more scalable architectures.
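One simple governance primitive for this is a per-agent query budget, for example a token bucket that caps how many queries an agent may issue per second while allowing short bursts. This is an illustrative sketch of the technique, not an Oracle or RAG framework feature:

```python
import time

class TokenBucket:
    """Allow at most `rate` queries/second per agent, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=2, capacity=5)        # 2 queries/sec, burst of 5
decisions = [bucket.allow() for _ in range(8)]  # 8 back-to-back agent requests
print(decisions)
```

Queries the bucket rejects can be queued, retried, or routed to a replica instead of the production database.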

The Traditional Approach: Cloning Production Data

One way that organisations have attempted to address this issue is by cloning production databases on a daily or weekly basis to offload AI-driven queries. However, this approach presents several major drawbacks:

  • Clones are not real-time – Since they are refreshed periodically, they lack the latest data required for up-to-the-minute AI-driven insights.
  • Cloud clones are expensive – Storage and compute costs for cloud-based clones can be prohibitively high, making large-scale adoption unsustainable.
  • Complexity & Latency – Maintaining and synchronising clones across multiple environments increases operational overhead and slows down analytics workflows.
  • Doesn’t scale with RAG demands – Since RAG requires real-time retrieval of data to generate accurate responses, stale clones fail to meet its needs.
For AI-driven analytics, relying on periodic database clones leads to significant inefficiencies. Since AI agents require access to up-to-date, contextual information, working with outdated data introduces data integrity risks and lowers the effectiveness of AI-generated insights. Additionally, managing and maintaining multiple clones across different environments adds a significant administrative burden, requiring additional governance, access control, and monitoring.

For Oracle Analytics users, these challenges could lead to outdated insights, reduced trust in AI-generated recommendations, and a poor user experience due to lagging or inconsistent data. Given these drawbacks, it’s clear that cloning is not a viable long-term solution for handling the database demands of Agentic AI.
A Shift in Data Architecture: Data Lakes & Lakehouses

Instead of relying on traditional RDBMS architectures, organisations are increasingly adopting data lakes and lakehouses to support AI-driven analytics. These architectures offer several key advantages:

  • Decoupled storage and compute – Unlike traditional databases, lakehouses allow AI agents to access structured and unstructured data without directly querying transactional systems.
  • Scalability – Cloud-based lakehouses scale seamlessly with AI workloads, reducing the risk of database bottlenecks.
  • Cost-effectiveness – By storing data in low-cost cloud object storage, lakehouses significantly reduce the expenses associated with cloning full databases.
However, while lakehouses present an effective alternative, the migration from a traditional RDBMS to a lakehouse architecture is not without its challenges. The shift requires significant investment in re-architecting data pipelines, ensuring data governance, and retraining teams to work with new query engines and tools. Moreover, performance tuning for structured transactional data in a lakehouse can be complex compared to optimised RDBMS queries, which may lead to initial adoption friction.
​
For users of Oracle Analytics, this shift could mean that existing reports and dashboards need to be refactored to work efficiently with a lakehouse structure, adding additional effort and complexity.

Optimising Performance with Modern Storage Solutions

Beyond adopting new architectural patterns, organisations can leverage modern storage solutions like Silk to mitigate the strain on production databases. Silk provides a virtualised, high-performance data layer that optimises storage performance and scalability without requiring a complete architectural overhaul.
By using Silk or similar intelligent storage virtualisation and caching technologies, organisations can:

  • Provide instant, efficient data replication – Unlike traditional cloud-based clones, modern solutions can create real-time, low-overhead replicas, ensuring fresh data access without excessive costs.
  • Optimise query performance – Intelligent caching and virtualisation technologies help ensure AI workloads do not compromise the performance of mission-critical systems.
  • Reduce the complexity of migrating from an RDBMS – By providing compatibility with existing RDBMS environments, these solutions offer a more gradual and manageable transition to modern data architectures.
For organisations using Oracle Analytics, integrating such solutions could help sustain real-time data access while alleviating the performance burden on production databases. However, despite these advantages, storage virtualisation and caching solutions are not a panacea. Organisations must still ensure that their AI workloads are properly governed to prevent excessive resource consumption, and they need to assess whether virtualised storage aligns with their broader data architecture and security policies.

Conclusion: Preparing for the Future of AI-Driven Analytics

Agentic AI and RAG are here to stay, and with them comes a fundamental shift in how data is accessed and managed. However, blindly allowing AI-driven queries to run against production databases is not a sustainable solution. To support the evolving demands of AI, organisations must modernise their data strategies by:

  • Shifting AI workloads to scalable architectures like data lakehouses
  • Implementing AI query governance to optimise performance
  • Leveraging modern storage technologies like Silk to mitigate traditional RDBMS bottlenecks

For Oracle Analytics users, this shift will require rethinking how data is stored, accessed, and processed to ensure that the platform continues to deliver timely insights without compromising performance. The key takeaway? Traditional database architectures were not designed for AI-driven workloads. To fully embrace the potential of Agentic AI and RAG, organisations must rethink their data foundations - or risk being left behind.

How is your organisation adapting to the challenges of AI-driven analytics? Let’s continue the conversation in the comments!

Is AI Making SQL Redundant? The Evolving Role of SQL in Oracle Analytics

14/2/2025


 
Introduction

Structured Query Language (SQL) has been the backbone of analytics for decades, enabling users to extract, transform, and analyse data efficiently. However, with the rise of AI-driven analytics, features like AI Assistant, Auto Insights, and Contextual Insights are allowing users to interact with data without writing SQL.

Does this mean SQL is becoming redundant? The answer isn’t a simple yes or no. AI is certainly abstracting SQL from business users, making analytics more accessible, but SQL still plays a critical role behind the scenes. This blog explores how AI is changing SQL’s role, where SQL is still essential, and what the future might hold.


How AI is Reducing SQL’s Visibility in Oracle Analytics

AI-powered features in Oracle Analytics allow users to explore data without manually writing SQL. Three key capabilities demonstrate this shift:

1. Contextual Insights: Auto-Generated SQL Behind the Scenes
• Example: A sales manager sees an unexpected spike in revenue on a dashboard. Instead of running queries, they click on the data point, and Contextual Insights automatically surfaces key drivers and trends.
• What happens in the background? Oracle Analytics generates SQL queries to identify correlations, anomalies, and patterns, but the user never sees or writes them.

2. AI Assistant: Querying Data Without SQL
• Example: A marketing analyst wants to compare Q1 and Q2 campaign performance. Instead of writing a SQL query, they ask the AI Assistant:
“Show me campaign revenue for Q1 vs Q2.”
• What happens in the background? The AI Assistant translates the request into SQL, retrieves the data, and presents a visual answer.
• Why it matters: Business users get answers instantly, without needing SQL expertise.
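To make the translation concrete, here is a minimal sketch of the kind of SQL such a request might compile to, run with Python's sqlite3 against invented sample data (the campaign_revenue table, its columns, and the generated query are illustrative assumptions, not OAC's actual output):

```python
import sqlite3

# Invented campaign data; a real request would run against governed OAC sources.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE campaign_revenue (campaign TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO campaign_revenue VALUES (?, ?, ?)",
    [("Spring Launch", "Q1", 120000), ("Spring Launch", "Q2", 150000),
     ("Brand Refresh", "Q1", 80000), ("Brand Refresh", "Q2", 95000)],
)

# One plausible translation of "Show me campaign revenue for Q1 vs Q2":
# pivot the quarters into columns so they can be compared side by side.
sql = """
SELECT campaign,
       SUM(CASE WHEN quarter = 'Q1' THEN revenue ELSE 0 END) AS q1_revenue,
       SUM(CASE WHEN quarter = 'Q2' THEN revenue ELSE 0 END) AS q2_revenue
FROM campaign_revenue
GROUP BY campaign
ORDER BY campaign
"""
for row in conn.execute(sql):
    print(row)
```

The point is that the user only ever sees the question and the visual answer; the pivoted aggregate above stays hidden.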

3. Auto Insights: Surfacing Trends Without Querying Data
• Example: A finance team wants to understand profit fluctuations. Instead of manually querying revenue data over time, they use Auto Insights, which highlights key trends and anomalies.
• What happens in the background? Oracle Analytics runs SQL queries to detect significant changes and patterns.

These features make SQL less visible but not obsolete. In fact, AI relies on SQL to function effectively, which leads to the question: where is SQL still essential?
Why SQL is Still Essential

While AI is making SQL more accessible, it hasn’t eliminated the need for SQL expertise. Several areas still require manual intervention:

1. Handling Complex Joins & Business Logic
• AI struggles with complex queries that involve multiple joins, subqueries, and conditional logic.
• Example: A financial analyst wants to calculate profitability by region, requiring a multi-table join across sales, inventory, and expenses. AI might generate an inefficient or incorrect query.
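As an illustration of why this is hard, here is a hedged sketch of the profitability query using Python's sqlite3 and invented sales and expenses tables. A naive row-level join would double-count revenue; aggregating each table before joining is exactly the kind of subtlety AI-generated SQL can miss:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales    (region TEXT, revenue REAL);
CREATE TABLE expenses (region TEXT, cost    REAL);
INSERT INTO sales    VALUES ('EMEA', 500), ('EMEA', 300), ('APAC', 400);
INSERT INTO expenses VALUES ('EMEA', 200), ('APAC', 150), ('APAC', 100);
""")

# Aggregate each table in a subquery *before* joining; joining the raw rows
# first would repeat revenue once per matching expense row and inflate profit.
sql = """
SELECT s.region, s.revenue - e.cost AS profit
FROM   (SELECT region, SUM(revenue) AS revenue FROM sales    GROUP BY region) s
JOIN   (SELECT region, SUM(cost)    AS cost    FROM expenses GROUP BY region) e
       ON s.region = e.region
ORDER BY s.region
"""
for region, profit in conn.execute(sql):
    print(region, profit)
```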

2. Performance Optimisation
• AI-generated SQL isn’t always the most efficient. SQL tuning (e.g., indexing, partitioning) still requires human expertise.
• Example: AI might generate a query that performs a full table scan instead of leveraging an index, slowing down performance.
3. Explainability & Trust
• AI-generated queries can sometimes produce unexpected results, making it difficult for users to validate the logic.
• Example: If AI Assistant returns an unusual data trend, an analyst may still need to inspect the underlying SQL to ensure accuracy.

SQL remains a crucial tool for data engineers, analysts, and DBAs who need control over data processing, query performance, and governance. However, as AI continues to evolve, could it overcome these challenges?

The Role of the Semantic Model in AI-Driven Analytics

One of the key features of Oracle Analytics is its semantic model, designed to abstract the complexity of source systems from end users. Instead of writing raw SQL queries against complex database structures, users interact with a logical layer that simplifies relationships, calculations, and security rules.
Why the Semantic Model Exists

The semantic model serves several purposes, including:

• Hierarchies & Drilldowns: Defining business hierarchies (e.g., Year → Quarter → Month) for intuitive analysis.
• Logical Joins & Business Logic: Providing a structured way to join tables without requiring users to understand foreign keys or database relationships.
• Row-Level Security: Enforcing access control so users only see the data they are authorised to view.

This abstraction enables self-service analytics while ensuring data governance, performance, and accuracy.
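A toy sketch of what such a layer does, with every name invented for illustration (OAC's actual semantic model is far richer): logical measures and attributes map to physical SQL, joins are predefined, and a row-level security predicate is stitched into every generated query.

```python
# A toy semantic layer: every name here is invented for illustration.
# Logical names map to physical SQL, and a row-level security predicate
# is appended to every query the layer generates.
SEMANTIC_MODEL = {
    "measures":   {"Revenue": "SUM(f.revenue)"},
    "attributes": {"Region": "d.region_name"},
    "joins": "FROM fact_sales f JOIN dim_region d ON f.region_id = d.region_id",
    "row_security": "d.region_name IN ({user_regions})",
}

def build_query(measure, attribute, user_regions):
    """Compile a logical request into physical SQL, security included."""
    m = SEMANTIC_MODEL["measures"][measure]
    a = SEMANTIC_MODEL["attributes"][attribute]
    rls = SEMANTIC_MODEL["row_security"].format(
        user_regions=", ".join(f"'{r}'" for r in user_regions))
    return (f"SELECT {a}, {m} {SEMANTIC_MODEL['joins']} "
            f"WHERE {rls} GROUP BY {a}")

print(build_query("Revenue", "Region", ["EMEA"]))
```

Because every request passes through the same mappings, two different queries for "Revenue by Region" can never disagree about joins or security.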
Will AI Make the Semantic Model Redundant?

AI-powered analytics features in Oracle Analytics Cloud (OAC)—such as Contextual Insights, AI Assistant, and Auto Insights—are reducing the need for manual query writing. But does this mean the semantic model is no longer needed? Not quite.
AI currently relies on the semantic model to:

• Ensure accurate and governed data access—AI cannot enforce security rules or business logic without a structured data layer.
• Interpret user queries correctly—When an AI Assistant generates SQL, it uses predefined joins and relationships from the semantic model.
• Maintain consistency—Without a semantic layer, different AI-generated queries might return inconsistent results due to varying assumptions about data relationships.

The Future: AI-Augmented Semantic Models?

Rather than replacing the semantic model, AI could enhance it by:

• Auto-generating relationships & joins based on data patterns.
• Improving performance optimisation, recommending indexing strategies or pre-aggregations.
• Enhancing explainability, showing why certain joins or hierarchies were applied.

AI and the Semantic Model Will Coexist

While AI reduces the need for manual SQL, the semantic model remains essential for structured, governed, and performant analytics. The future is likely AI-assisted semantic models rather than their elimination.
The Future of AI in SQL Generation

AI will likely become more sophisticated in handling SQL, but rather than eliminating it, AI will enhance SQL’s role. Here’s what the future might look like:

1. AI-Powered Query Optimisation
• AI could not only generate SQL but also analyse and optimise it for better performance.
• Example: Future AI Assistants might suggest indexing strategies, rewrite inefficient queries, or recommend materialised views.
2. Better Handling of Complex Joins & Business Logic
• AI could integrate knowledge graphs or semantic layers to better understand relationships between tables, improving the accuracy of generated SQL.
3. Explainable AI for SQL Generation
• AI might offer query rationale explanations, showing users why a specific query was generated and suggesting alternative approaches.
4. AI Agents & Autonomous Databases
• AI Agents could work alongside SQL experts, automating routine queries while letting humans handle complex cases.
• Oracle’s Autonomous Database could play a larger role in self-optimising SQL execution.

While AI will continue to reduce the need for manual SQL writing, it is more likely to enhance SQL rather than replace it.
Final Thoughts: Adapting to the AI-Driven Analytics Landscape

AI is shifting SQL from a tool business users interact with directly to something that powers insights in the background. However, this doesn’t mean SQL is going away. Instead:

• Business users will rely more on AI-driven insights without needing SQL knowledge.
• Data engineers and analysts will still need SQL expertise to optimise performance, manage governance, and handle complex queries.
• The future is AI-Augmented SQL, not SQL-Free Analytics.

For professionals in the analytics space, this means embracing AI while still sharpening SQL skills. AI will make SQL more powerful, but those who understand both will be best positioned to leverage the full potential of Oracle Analytics.
What Do You Think?

Do you see AI replacing SQL in your analytics work, or do you think SQL will remain a core skill? Let’s discuss in the comments!

Next Steps

• If you’re interested in seeing these AI-driven analytics features in action, explore Oracle Analytics’ AI Assistant, Auto Insights, and Contextual Insights.
• Stay tuned for more insights on AI’s role in modern analytics on Elffar Analytics.

    Author

    A bit about me. I am an Oracle ACE Pro, Oracle Cloud Infrastructure 2023 Enterprise Analytics Professional, Oracle Cloud Fusion Analytics Warehouse 2023 Certified Implementation Professional, Oracle Cloud Platform Enterprise Analytics 2022 Certified Professional, Oracle Cloud Platform Enterprise Analytics 2019 Certified Associate and a certified OBIEE 11g implementation specialist.
