Connect to personal and shared OneDrive drives across your organisation. Recursively crawl every folder, extract every supported document, and index it for AI-powered search.
ChunkIQ accesses every OneDrive drive type securely, with no manual configuration per user.
Each user's personal OneDrive for Business drive. Files stored directly in My Files, including nested folders of any depth, are fully crawled and extracted.
Shared drives and document libraries shared with the user. ChunkIQ enumerates all accessible drives and includes them in the extraction run.
Traverses folders recursively regardless of nesting depth. Captures the full folder path in metadata so results can be filtered by directory in search.
Files shared with the authenticated user from other drives. ChunkIQ resolves the remote item references and includes them in the extraction queue.
ChunkIQ traverses folder hierarchies of unlimited depth using efficient delta queries, capturing every file regardless of where it's stored.
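As an illustration of how full folder paths can be recovered from a flat delta listing, here is a minimal sketch. It is not ChunkIQ's actual code. It assumes each item is a dict shaped like a Microsoft Graph driveItem (`id`, `name`, `parentReference.id`, and a `root` facet on the drive root), and that paths must be rebuilt from parent ids because delta responses are documented as omitting `parentReference.path`. The function name `build_folder_paths` is hypothetical.

```python
from typing import Dict, Iterable


def build_folder_paths(items: Iterable[dict]) -> Dict[str, str]:
    """Map each item id to its full path by walking parent ids up to the root.

    Iterative rather than recursive, so very deep hierarchies do not hit
    Python's recursion limit.
    """
    by_id = {item["id"]: item for item in items}
    paths: Dict[str, str] = {}
    for item_id in by_id:
        parts = []
        seen = set()  # guards against malformed parent cycles
        current = by_id.get(item_id)
        while (
            current is not None
            and "root" not in current
            and current["id"] not in seen
        ):
            seen.add(current["id"])
            parts.append(current["name"])
            parent_id = current.get("parentReference", {}).get("id")
            current = by_id.get(parent_id)
        paths[item_id] = "/" + "/".join(reversed(parts))
    return paths
```

An item whose parent is missing from the batch is treated as top level, which is a simplification; a real crawler would look the parent up in its stored state.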
Files embedded inside Word, Excel, and PowerPoint documents are extracted and indexed separately, each with lineage metadata back to the parent document.
Uses OneDrive delta tokens to track changes since the last run. Only new, modified, or deleted items are processed, which makes large drives efficient to keep fresh.
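The incremental behaviour described above can be sketched as a split of one delta batch into files to (re)index and ids to remove. This is an assumption-laden illustration, not the product's implementation: it relies on Microsoft Graph marking removed items with a `deleted` facet and files with a `file` facet, and `split_changes` is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple


def split_changes(items: Iterable[dict]) -> Tuple[List[dict], List[str]]:
    """Separate a delta batch into files to index and item ids to delete."""
    to_index: List[dict] = []
    to_remove: List[str] = []
    for item in items:
        if "deleted" in item:
            # Removed since the last run: drop its chunks from the index.
            to_remove.append(item["id"])
        elif "file" in item:
            # New or modified file: extract and index it again.
            to_index.append(item)
        # Folders and other non-file items carry no content to extract.
    return to_index, to_remove
```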
Every chunk records the drive ID, drive type, owner, folder path, file name, file size, and last modified date for precise filtering and attribution.
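One way to picture the per-chunk metadata is a small record type. The field list mirrors the sentence above; the class name, field names, and the `metadata_from_item` helper are hypothetical, and the mapping assumes Microsoft Graph driveItem properties (`name`, `size`, `lastModifiedDateTime`, `parentReference.driveId`, `parentReference.driveType`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkMetadata:
    drive_id: str
    drive_type: str
    owner: str
    folder_path: str
    file_name: str
    file_size: int
    last_modified: str  # ISO 8601 timestamp, kept as the source string


def metadata_from_item(item: dict, owner: str, folder_path: str) -> ChunkMetadata:
    """Build chunk metadata from a driveItem-shaped dict (assumed shape)."""
    parent = item.get("parentReference", {})
    return ChunkMetadata(
        drive_id=parent.get("driveId", ""),
        drive_type=parent.get("driveType", ""),
        owner=owner,
        folder_path=folder_path,
        file_name=item["name"],
        file_size=item.get("size", 0),
        last_modified=item.get("lastModifiedDateTime", ""),
    )
```

Keeping the record immutable makes it safe to attach the same metadata object to every chunk produced from one file.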
Processes .pdf, .docx, .xlsx, .pptx, .xlsm, .csv, .json, .txt, and .md files found anywhere in the drive structure.
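A file-type gate for the formats listed above could look like the following sketch. The extension set comes from the list in the text; the names `SUPPORTED_EXTENSIONS` and `is_supported` are illustrative, and case-insensitive matching is an assumption.

```python
from pathlib import PurePosixPath

SUPPORTED_EXTENSIONS = frozenset(
    {".pdf", ".docx", ".xlsx", ".pptx", ".xlsm", ".csv", ".json", ".txt", ".md"}
)


def is_supported(file_name: str) -> bool:
    """Return True if the file's final extension is one of the listed types."""
    return PurePosixPath(file_name).suffix.lower() in SUPPORTED_EXTENSIONS
```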
All data is written to Azure Data Lake Storage Gen2 within your own subscription. Managed identity auth, no external data transfers.
Authenticates via Azure AD with Files.Read.All to access all OneDrive drives in the tenant, including personal and shared drives.
Lists all drives for each user, then recursively enumerates folders and files. Delta tokens are stored for efficient subsequent runs.
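The enumeration step can be sketched as a paging loop over a delta endpoint. This is a simplified illustration under stated assumptions: `fetch_page` is a hypothetical callable that performs an authenticated GET and returns the parsed JSON, and the response keys `value`, `@odata.nextLink`, and `@odata.deltaLink` follow the Microsoft Graph delta convention.

```python
from typing import Callable, List, Optional, Tuple


def collect_delta(
    fetch_page: Callable[[str], dict], start_url: str
) -> Tuple[List[dict], Optional[str]]:
    """Follow nextLink pages to the end and return (items, delta link).

    The returned delta link is what a later run would use as its start URL
    to receive only the changes made since this run.
    """
    items: List[dict] = []
    url = start_url
    while True:
        page = fetch_page(url)
        items.extend(page.get("value", []))
        next_link = page.get("@odata.nextLink")
        if next_link is None:
            # Last page: it carries the delta link instead of a next link.
            return items, page.get("@odata.deltaLink")
        url = next_link
```

Injecting `fetch_page` keeps the loop independent of any HTTP library and easy to exercise with canned pages.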
Dedicated extractors process each file type. Text is cleaned, split into semantic chunks, and tagged with full drive/folder provenance metadata.
Chunks are vectorised with Azure OpenAI and pushed to Azure AI Search for hybrid BM25 + vector + semantic retrieval.
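For orientation, a hybrid query combining keyword, vector, and semantic ranking might be assembled as below. The body shape follows the Azure AI Search REST search request as commonly documented (`search`, `vectorQueries`, `queryType`, `semanticConfiguration`); the vector field name `contentVector`, the semantic configuration name `default`, and the helper `build_hybrid_query` are assumptions, not ChunkIQ's actual index schema.

```python
from typing import List


def build_hybrid_query(text: str, embedding: List[float], top: int = 5) -> dict:
    """Assemble a search request body for a hybrid query (assumed schema)."""
    return {
        "search": text,  # keyword (BM25) part of the query
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": embedding,        # query embedding from the same model
                "fields": "contentVector",  # hypothetical vector field name
                "k": top,
            }
        ],
        "queryType": "semantic",             # enables semantic reranking
        "semanticConfiguration": "default",  # hypothetical configuration name
        "top": top,
    }
```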
Personal drives, shared drives, nested folders: all automatically extracted and ready for AI-powered search.