Power BI Dataflows in Catalog

You'll learn how Catalog uses Power BI Dataflows when it builds lineage from reports and data sets back to warehouse tables. You'll also see:

  • How standard Power BI reports and dashboards compare to paginated reports for lineage depth
  • What Catalog stores for each Dataflow
  • How to open and read that lineage in Catalog
  • How to find a Dataflow when search or navigation feels unclear
  • Which M patterns Catalog recognizes
  • What to expect for freshness and schedules
  • What you need in place when lineage looks incomplete or stale

What Is a Power BI Dataflow?

A Power BI Dataflow is an optional data preparation layer in Power BI. Dataflows use Power Query M to connect to sources, shape data, and publish entities that data sets and other Dataflows can reuse. In many architectures, data moves from a warehouse into a Dataflow, then from the Dataflow into one or more data sets, then into reports and dashboards.

Microsoft's overview explains concepts and authoring: Introduction to Dataflows.

Which Power BI Assets Connect to Warehouse Lineage

Use this section to see which kinds of Power BI content Catalog is built to trace deeply through semantic data sets and related metadata, and where expectations should be lower. Catalog ties BI assets to warehouse objects your integrations have already synced; the shape of Microsoft’s APIs and the asset type both influence how complete that graph is.

| Asset | Role in lineage | What to expect |
| --- | --- | --- |
| Semantic data sets | Models that store imported or DirectQuery tables and relationships Power BI reports consume. | Primary path from warehouse tables into Power BI. Strong table-level and column-level lineage when warehouse objects exist in Catalog, admin APIs expose metadata, and models stay refreshed as described in Power BI. |
| Standard Power BI reports | Interactive reports authored against semantic data sets you build in Power BI Desktop or the Power BI service. | Catalog treats these as the main report type for dependency graphs: upstream through the linked data set toward warehouse tables when that link exists in ingested metadata. |
| Dashboards | Collections of tiles and pinned visuals that reference underlying reports and visuals. | Lineage flows through those references into reports and data sets as Microsoft’s metadata and Catalog ingestion allow. |
| Power BI Dataflows | Optional Power Query layers that publish entities other Dataflows and data sets reuse. | Covered end to end on this page: entity resolution, M patterns, and limits are in the sections below. |
| Paginated reports | Print-oriented or operational layouts, often built in Report Builder and saved as .rdl, backed by a different reporting model than typical interactive Power BI reports. | Paginated reports use a different reporting stack; the metadata available for them does not support the same table-level and column-level resolution. Field-level detail in Catalog is usually lighter as well. For regulatory or document-style outputs, validate those assets directly rather than assuming the same depth you get when you trace interactive reports through data sets and Dataflows to the warehouse. |

When you're validating lineage for a specific asset, confirm whether it is an interactive report on a semantic data set or a paginated report. Mixed expectations across those types are a common reason two teams see different Catalog depth for “reports” in Power BI.

Paginated URLs and Catalog ingestion

Some product surfaces can recognize Power BI links whose paths include paginated report locations. That only means the URL is parsed as a Power BI resource. It does not change how Catalog ingests metadata or builds lineage for paginated reports relative to standard reports.

For credentials, admin APIs, extraction schedules, and environment-specific behavior, use Power BI as the setup reference alongside this Dataflows guide.

How Dataflows Relate to Lineage in Catalog

Catalog maintains a metadata graph that links BI assets to warehouse objects your integrations have already synced into Catalog. When a data set table loads from a Dataflow, that Dataflow is an extra step between the data set and the warehouse. If that step isn't represented in the metadata Catalog ingests from Power BI, lineage can stop at the data set even when warehouse sync is healthy.

Catalog processes Dataflows as part of Power BI lineage so asset-level and column-level lineage can run through the Dataflow layer when the underlying metadata is available and upstream objects exist in Catalog. You don't turn on a separate Dataflows option in Catalog. The same Power BI credentials, Admin API Settings, and extraction runs cover Dataflow metadata when your tenant and models expose it.

Lineage quality still depends on both sides of the graph:

  • Power BI - Admin settings, refresh or republish behavior, and what the Power BI APIs return for mashup and related metadata. See Power BI for details.
  • Warehouse and other sources - Tables and columns must be present in Catalog through your warehouse integration or another integration. If Catalog doesn't know about a database or connection your Dataflow uses, lineage can be incomplete even when Power BI extraction succeeds.

The following sections explain how that Dataflow layer is recorded, how to open it in Catalog, how to find Dataflows in search, what to expect for sync cadence, and how to interpret supported M patterns next to reports and warehouse tables.

What Catalog Stores for Power BI Dataflows

Catalog ingests each Power BI Dataflow as a Power BI visualization model, in the same family as a semantic data set. Catalog distinguishes a Dataflow from a semantic data set for lineage and storage. You see both types in the same Power BI areas of Catalog, for example Dashboards and search results for visualization models, using the names Power BI assigns in your tenant.

For each Dataflow, Catalog stores entities as tables on that model. For each entity, Catalog keeps:

  • The entity name
  • The M query text
  • Resolved paths to warehouse tables when the M reads the warehouse
  • Links between entities when one entity's M references another Power BI table in the same flow, a pattern typical for linked entities

Catalog uses those internal entity links when it resolves lineage all the way to the warehouse, including for column-level lineage when M and metadata are clear enough.

When a semantic data set loads from a Dataflow, its M can use the Power BI Dataflows or Power Platform Dataflows connectors. Catalog parses the workspace, Dataflow ID, and entity name from that M so the data set connects to the right flow and entity. Both the Dataflow ID and the entity name must appear in the shape Catalog expects, as described in Supported Patterns and Limits; otherwise the Dataflow reference does not resolve from that M alone.
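To make the idea of a resolvable reference concrete, the following Python sketch pulls a workspace, Dataflow ID, and entity name out of data set M text. It is not Catalog's parser: the regular expressions, the `DataflowRef` type, the `parse_dataflow_ref` function, and the sample GUIDs are assumptions made for this example, and the sample M only approximates the navigation steps Power BI Desktop typically generates for the `PowerBI.Dataflows` connector.

```python
import re
from typing import NamedTuple, Optional


class DataflowRef(NamedTuple):
    """Hypothetical container for the three values read from the M text."""
    workspace_id: str
    dataflow_id: str
    entity: str


# Assumption: the reference starts from one of the two Dataflows connectors.
_CONNECTOR = re.compile(r"\b(?:PowerBI|PowerPlatform)\.Dataflows\s*\(")

# Assumption: each value appears as a quoted literal in a record lookup,
# for example {[dataflowId="..."]}. Parameterized values will not match.
_FIELDS = {
    name: re.compile(r"\b" + name + r'\s*=\s*"([^"]+)"')
    for name in ("workspaceId", "dataflowId", "entity")
}


def parse_dataflow_ref(m_text: str) -> Optional[DataflowRef]:
    """Return the reference if every part is a literal, otherwise None."""
    if not _CONNECTOR.search(m_text):
        return None
    found = {}
    for name, pattern in _FIELDS.items():
        match = pattern.search(m_text)
        if match is None:
            return None  # a missing or indirect part means no resolution
        found[name] = match.group(1)
    return DataflowRef(found["workspaceId"], found["dataflowId"], found["entity"])


# Illustrative M with made-up identifiers.
SAMPLE_M = '''
let
    Source = PowerBI.Dataflows(null),
    Workspace = Source{[workspaceId="11111111-aaaa-4bbb-8ccc-000000000001"]}[Data],
    Flow = Workspace{[dataflowId="22222222-dddd-4eee-8fff-000000000002"]}[Data],
    Customers = Flow{[entity="Customers"]}[Data]
in
    Customers
'''

print(parse_dataflow_ref(SAMPLE_M))
```

In this sketch, replacing the quoted Dataflow ID with a parameter name makes the function return `None`, which mirrors why parameterized or indirect M can leave a data set without a resolved Dataflow reference.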

Column-level lineage builds on the same graph. It can be weaker than table-level lineage when an entity has several warehouse sources in one logical table, when M is heavily parameterized or indirect, or when field mappings are ambiguous. Some rename patterns in M resolve more reliably than others; see Troubleshoot Power BI Lineage When Columns Are Renamed for detail.

For measures you define in DAX, field lineage in Catalog shows relationships in the graph when paths resolve. Read or edit the full measure formula in Power BI Desktop or your standard authoring tools. Catalog field lineage and lineage detail panels focus on the dependency graph rather than reproducing that formula text. For more detail, see Measures and DAX in Troubleshoot Power BI Lineage When Columns Are Renamed.

Lineage in the Catalog UI

Catalog shows lineage between reports, visualization models, and warehouse tables in the same lineage graph as other assets. Dataflows and semantic data sets both count as visualization models. The graph can include intermediate steps, for example warehouse to Dataflow entity to semantic data set, even when you focus on an end-to-end question.

Use this sequence when you want to inspect Dataflow-backed lineage:

  1. Open the Power BI asset in Catalog from Dashboards in the left navigation or from Catalog search. Quick and advanced search both include visualization models; Power BI Dataflows appear alongside Power BI data sets in that visualization model family. Open the visualization model whose name matches the Dataflow or semantic data set in Power BI, or open a report or dashboard that depends on that model. Then select the Lineage tab on that asset.

    Figure: On a report or dashboard, the Lineage tab shows preview cards for Dashboard Lineage and Field Lineage, with buttons to open each graph, before you open the full graph.
  2. On a report or dashboard, choose Open Dashboard Lineage to inspect object-level dependencies in the lineage canvas. On a visualization model, the canvas opens from the same Lineage tab without those preview cards. Use + on the left to expand upstream sources and on the right to expand downstream usage, as described in Lineage. When a data set loads from a Dataflow, that Dataflow appears as another visualization model in the graph once metadata resolves.

    Figure: Dashboard lineage with the report focused; upstream nodes include the semantic data set and a warehouse view, with a Details sidebar.
  3. Select a warehouse table or view in the graph, or open that warehouse asset and its lineage, when you want to confirm how Catalog links warehouse objects into Power BI. Downstream nodes show which visualization models and reports consume the warehouse object.

    Figure: Lineage canvas focused on a warehouse view, with downstream nodes for a semantic data set and a report.
  4. For column-level paths, return to the asset Lineage tab and choose Open Field Lineage, or from a warehouse column use Go To Column Lineage when that control is available. Field lineage depends on how clearly Power Query maps fields. Dataflow-backed paths show here when table-level links support column resolution.

    Figure: Field lineage for one column, traced from a warehouse view through a semantic data set field into a report.

Catalog recomputes lineage every time your sources sync. The lineage graph uses data from the last 30 days; if links look stale or a recent change is missing, adjust the time range under the graph.

Supported Patterns and Limits

Catalog anchors Dataflow support on what the Power BI admin APIs return and on M patterns the product parses and tests. Treat the following as the supported surface for data set to Dataflow references in M:

  • Recognized connector entry points - M that navigates through PowerBI.Dataflows or PowerPlatform.Dataflows toward a workspace, Dataflow ID, and entity name.
  • Composite key - Catalog ties a reference to a flow and entity when both the Dataflow identifier and the entity appear in the expected structure. If either is missing or expressed in a way the parser does not recognize, lineage through that Dataflow does not resolve until the model's M matches a supported shape and Power BI exposes matching admin metadata.

Linked entities inside a Dataflow, where one entity is built on another, are part of supported modeling: Catalog follows those internal links when resolving paths to the warehouse.
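Following internal links to the warehouse can be pictured as a small graph walk. The Python sketch below is only an illustration under stated assumptions: the `ENTITY_SOURCES` mapping, the `wh:` prefix for warehouse paths, the entity and table names, and the `warehouse_tables` function are all hypothetical and do not describe how Catalog stores or resolves this metadata.

```python
from typing import Dict, List, Optional, Set

# Hypothetical shape: entity name -> direct upstream references.
# Values starting with "wh:" are warehouse table paths; any other value
# names another entity in the same Dataflow (a linked entity).
ENTITY_SOURCES: Dict[str, List[str]] = {
    "RawOrders": ["wh:analytics.sales.orders"],
    "RawCustomers": ["wh:analytics.sales.customers"],
    "CleanOrders": ["RawOrders"],
    "OrdersWithCustomers": ["CleanOrders", "RawCustomers"],
}


def warehouse_tables(
    entity: str,
    sources: Dict[str, List[str]],
    _seen: Optional[Set[str]] = None,
) -> Set[str]:
    """Collect every warehouse table reachable from an entity."""
    seen = set() if _seen is None else _seen
    if entity in seen or entity not in sources:
        return set()  # already visited, or nothing known upstream
    seen.add(entity)
    tables: Set[str] = set()
    for ref in sources[entity]:
        if ref.startswith("wh:"):
            tables.add(ref[len("wh:"):])
        else:
            # Follow the internal link to the upstream entity.
            tables |= warehouse_tables(ref, sources, seen)
    return tables


print(sorted(warehouse_tables("CleanOrders", ENTITY_SOURCES)))
print(sorted(warehouse_tables("OrdersWithCustomers", ENTITY_SOURCES)))
```

In this toy model, `CleanOrders` reaches a single warehouse table through one linked entity, while `OrdersWithCustomers` reaches two. An entity that resolves to more than one table is the multi-source case where column-level lineage can be incomplete even though table-level links exist.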

Plan for the following limits:

  • Ambiguous or multi-source entities - When one entity resolves to multiple warehouse table paths in a way Catalog cannot reduce to a single path, column-level lineage can be incomplete even when table-level links exist.
  • Dynamic or complex M - Parameters, indirection, or unusual connector shapes can weaken lineage until metadata and M align with what extraction returns.
  • Warehouse gaps - If the warehouse tables your Dataflow reads are not in Catalog, the graph stops where Catalog has no upstream object, regardless of Power BI extraction.

Fabric Dataflow behavior follows the same Power BI integration, admin API output, and M patterns described in this section. Use those references when you design validations for Fabric-backed flows.

Performance and Freshness

Use this section to set expectations for how often Dataflow-backed metadata and lineage update in Catalog. Treat extraction duration, Microsoft API behavior, and lineage recomputation timing as driven by your integration schedules and successful sync outcomes rather than by fixed timing promises in this documentation.

In Catalog-managed environments, the first Power BI sync can take up to 48 hours, as described in Power BI. Treat any Dataflow or data set as potentially absent from search and lineage until that first pass completes successfully.

For ongoing Power BI metadata, after the first sync, Catalog-managed environments follow the schedule you coordinate with Catalog operations. Client-managed environments follow the schedule you configure for castor-extract-powerbi and upload, as in Power BI. After your trial, you schedule extraction at your desired frequency, up to once per day, so Dashboard sections stay current, as described in Data visualization integrations.

For warehouse metadata, warehouse integrations typically sync once per day after the first sync, as described in the Catalog onboarding guide. Lineage through a Dataflow still requires warehouse tables and columns to exist in Catalog, so warehouse freshness and Power BI freshness both matter.

When models or admin settings change, use Power BI to refresh or republish affected data sets, then allow a full Catalog extraction cycle before you judge search results or lineage. Power BI sometimes serves updated mashup and admin metadata shortly after your change, but Catalog only reflects it after the next successful extraction.
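The timing rule above can be expressed as a small check. This Python sketch assumes you keep your own record of extraction runs as start time and success flag pairs; the `first_reflecting_extraction` function and the run log format are hypothetical and are not part of Catalog or castor-extract-powerbi.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple


def first_reflecting_extraction(
    change_at: datetime,
    runs: Iterable[Tuple[datetime, bool]],
) -> Optional[datetime]:
    """Return the start of the earliest successful run after the change.

    Each run is (started_at, succeeded). A failed run, or a run that
    started before the change, cannot carry the new metadata.
    """
    candidates = [started for started, ok in runs if ok and started > change_at]
    return min(candidates) if candidates else None


# Made-up run log: the second run failed.
runs = [
    (datetime(2024, 1, 1, 2, 0), True),
    (datetime(2024, 1, 2, 2, 0), False),
    (datetime(2024, 1, 3, 2, 0), True),
]

# A data set republished at midday on January 1.
print(first_reflecting_extraction(datetime(2024, 1, 1, 12, 0), runs))
```

If the function returns `None`, no successful extraction has started since the change, so it is too early to judge search results or lineage.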

How to Validate Dataflow Lineage

Follow these steps when you want to confirm that lineage crosses a Dataflow path.

  1. Confirm Power BI admin settings
    In the Power BI Admin portal, verify that the Admin API Settings called out in Power BI remain enabled for your Catalog service principal's security group.

  2. Refresh or republish affected data sets
    Follow refresh and republish guidance for data sets in Power BI so lineage-related metadata is current in Power BI before the next Catalog extraction.

  3. Confirm the warehouse side is in Catalog
    Open the warehouse integration documentation for your platform, such as Snowflake, and ensure the databases and objects your Dataflow reads are in scope for sync. If a data set still shows no upstream tables, add or extend the warehouse source that backs those tables, then let Catalog run another sync.

  4. Wait for the next scheduled extraction
    Catalog-managed environments run on a schedule you coordinate with Catalog operations. Client-managed environments use your own schedule for castor-extract-powerbi and upload. See Power BI.

If lineage for Dataflow-backed tables was empty in the past and prerequisites are fixed now, run through the same steps again and allow a full extraction cycle before you open a Support ticket.

Troubleshooting

For incomplete lineage, missing Dataflows in search, stale metadata after Admin API changes, unrecognized M patterns, and client-managed extraction issues, use Power BI troubleshooting in Catalog. That guide collects step-by-step checks in one place so this article stays focused on how Dataflows work in Catalog.

What's Next?