Microsoft 365 Connector + Search Portal

Teams & Search Portal

Automatically extract every file shared across Microsoft Teams channels and team sites, then search across all indexed content through a built-in Streamlit search portal.

Auto team discovery
3 channel types
Hybrid BM25 + vector search
100% native extraction

Every Teams file surface, covered

ChunkIQ discovers all teams in your tenant and indexes files across all channel types automatically.

Standard Channels

Files posted or shared in public channels. Automatically maps the underlying SharePoint team site and crawls the Files tab library.

Private Channels

Private channels with dedicated SharePoint sites. Each is discovered and crawled separately with appropriate permission scopes.

Shared Channels

Cross-tenant shared channels and their associated document libraries are enumerated and included in the extraction run.

Team Sites (SharePoint)

Every Microsoft Team has a backing SharePoint site. ChunkIQ crawls all document libraries on these sites, not just the default Files tab.

Built for large, complex Teams environments

No per-team configuration needed. Add ChunkIQ once and every current and future team is covered.

Automatic Team Discovery

Enumerates every team in the tenant. Teams created after initial setup are picked up on the next run.

Deep Library Crawl

Goes beyond the default Files tab to discover custom document libraries, wiki content libraries, and any other SharePoint libraries on the team site.

Attachment Extraction

Extracts and processes files embedded inside Word, Excel, and PowerPoint documents found in Teams, tracking lineage back to the parent file and channel.

Teams-Aware Metadata

Each chunk is tagged with team name, channel name, site URL, library, file path, and modification date, enabling fine-grained filtering in search results.
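
As an illustration of that filtering, the sketch below builds an Azure AI Search OData filter string from chunk metadata. The field names (`team`, `channel`) are hypothetical and depend on how the index is actually defined; only the `eq` / `and` syntax and the doubled single quote escaping come from OData itself.

```python
def odata_filter(**fields: str) -> str:
    """Build an OData equality filter such as "team eq 'engineering'".

    Field names are hypothetical; they must match the index schema in use.
    """
    clauses = []
    for name, value in fields.items():
        # OData string literals escape a single quote by doubling it.
        escaped = value.replace("'", "''")
        clauses.append(f"{name} eq '{escaped}'")
    return " and ".join(clauses)


if __name__ == "__main__":
    print(odata_filter(team="engineering", channel="General"))
```

The resulting string could be passed as the filter of a search request so that only chunks from one team and channel are returned.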

Incremental Updates

Content-hash deduplication means only new or changed files are extracted. Unchanged files are skipped on every run, so tenants with thousands of files avoid redundant processing.
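
A minimal sketch of content-hash deduplication, assuming SHA-256 over the raw file bytes and an in-memory dictionary as the hash store. Both are assumptions made for illustration; the page does not say which hash or store ChunkIQ uses.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(data).hexdigest()


def files_to_extract(current: dict[str, bytes], seen: dict[str, str]) -> list[str]:
    """Return paths whose content is new or changed since the previous run.

    `current` maps file path -> file bytes for this run (hypothetical shape).
    `seen` maps file path -> hash recorded earlier; it is updated in place.
    """
    changed = []
    for path, data in current.items():
        digest = content_hash(data)
        if seen.get(path) != digest:
            changed.append(path)
            seen[path] = digest  # remember the hash for the next run
    return changed


if __name__ == "__main__":
    store: dict[str, str] = {}
    print(files_to_extract({"a.docx": b"v1", "b.pdf": b"x"}, store))
    print(files_to_extract({"a.docx": b"v2", "b.pdf": b"x"}, store))
```

On the second call only the path whose bytes changed is returned, which is the behavior the feature describes.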

Secure by Design

Runs entirely within your Azure tenant. Uses managed identity or service principal with the minimum required permissions.

Search your Teams content instantly

ChunkIQ Teams ships with a Streamlit-powered search portal. Hybrid BM25 + vector search with semantic re-ranking returns the most relevant chunks from every team, channel, and document.

Hybrid Search

Combines BM25 keyword scoring with 1,536-dimensional HNSW vector search. The two result lists are merged with Reciprocal Rank Fusion, then re-ranked by the Azure AI Search semantic ranker.
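
A minimal sketch of Reciprocal Rank Fusion over a keyword ranking and a vector ranking. Azure AI Search performs this fusion server-side; the function below only illustrates the formula, and the constant `k = 60` is a commonly used default assumed here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank starts
    at 1); scores are summed and documents sorted by descending total.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Ties are broken by document id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # hypothetical keyword ranking
    vector_hits = ["b", "c", "a"]  # hypothetical vector ranking
    print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
```

A document that ranks reasonably well in both lists can outrank one that is first in only one of them, which is why this fusion suits hybrid search.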

Source-Aware Results

Every result shows the originating team, channel, file name, and folder path. Click straight through to the source document in Microsoft Teams.

Zero-Config Portal

The Streamlit app connects directly to your Azure AI Search index using the same connection settings as the pipeline. No additional backend required.

Search across all Teams content…
Q3 Product Roadmap.pptx
product-team · General · Slide 4
"…the new pipeline will support incremental delta sync from SharePoint and OneDrive, reducing ingest time by…"
Engineering Onboarding Guide.docx
engineering · Private · Documents/HR
"…access to ADLS Gen2 is granted via managed identity. No connection strings are stored in code or…"
Sprint 22 Retrospective.pdf
dev-team · Shared · Meeting Notes
"…agreed to migrate the extraction stage to native parsers to remove the Document Intelligence dependency and…"

From Teams to searchable index in 4 steps

Step 01
Authenticate

ChunkIQ authenticates via Azure AD (now Microsoft Entra ID) with Group.Read.All and Files.Read.All permissions to access all teams and their underlying SharePoint sites.
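
For illustration, an app-only sign-in with a service principal typically uses the OAuth 2.0 client credentials flow. The sketch below only builds the token request and does not send it. It assumes a client secret is used, which the page does not state (managed identity is also offered), and the helper name `build_token_request` is hypothetical.

```python
from urllib.parse import urlencode

# The .default scope requests whatever application permissions have been
# granted to the app registration (here, Group.Read.All and Files.Read.All).
GRAPH_DEFAULT_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, bytes]:
    """Return (url, form_body) for a client credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": GRAPH_DEFAULT_SCOPE,
        }
    ).encode("ascii")
    return url, body


if __name__ == "__main__":
    # Placeholder values; real ones come from the app registration.
    url, body = build_token_request("contoso-tenant", "app-id", "s3cret")
    print(url)
```

The body would be POSTed as `application/x-www-form-urlencoded`, and the returned access token sent as a bearer token on Graph calls.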

Step 02
Discover Teams & Sites

Enumerates all teams, resolves their SharePoint team sites, and lists every document library, with standard, private, and shared channels included.
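
The discovery step can be sketched roughly as follows, assuming the Graph v1.0 base URL and a caller-supplied `get_json` function (hypothetical) that performs an authenticated GET and returns the parsed JSON. Paging follows the `@odata.nextLink` property that Graph returns on collections, and the site lookup relies on a team sharing its id with its Microsoft 365 group.

```python
from typing import Callable, Iterator

GRAPH = "https://graph.microsoft.com/v1.0"  # assumed API version


def iter_pages(url: str, get_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item of a paged Graph collection, following @odata.nextLink."""
    while url:
        page = get_json(url)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")  # None on the last page ends the loop


def team_site_url(team_id: str) -> str:
    """Graph URL of the SharePoint root site behind a team."""
    return f"{GRAPH}/groups/{team_id}/sites/root"


def discover_sites(get_json: Callable[[str], dict]) -> list[str]:
    """List the site-resolution URL for every team in the tenant."""
    return [team_site_url(team["id"]) for team in iter_pages(f"{GRAPH}/teams", get_json)]


if __name__ == "__main__":
    # Canned responses stand in for real Graph calls.
    fake = {
        f"{GRAPH}/teams": {"value": [{"id": "t1"}], "@odata.nextLink": "page2"},
        "page2": {"value": [{"id": "t2"}]},
    }
    print(discover_sites(fake.__getitem__))
```

Passing the HTTP call in as a function keeps the paging logic testable without network access.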

Step 03
Extract & Chunk

Dedicated extractors process each file format. Text is cleaned, split into semantic chunks, and enriched with Teams-specific provenance metadata.
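
To make the chunk-and-enrich idea concrete, here is a simplified sketch that packs paragraphs into size-limited chunks and attaches provenance fields. It is not ChunkIQ's chunker: the paragraph-packing strategy, the `max_chars` limit, and the field names are all assumptions chosen for illustration.

```python
def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Pack cleaned paragraphs into chunks of at most `max_chars` characters.

    A single paragraph longer than `max_chars` is kept whole as its own chunk.
    """
    # Cleaning step: collapse runs of whitespace inside each paragraph.
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n")]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def tag_chunks(chunks: list[str], team: str, channel: str, file_path: str) -> list[dict]:
    """Attach hypothetical provenance fields to each chunk."""
    return [
        {"text": c, "team": team, "channel": channel, "file_path": file_path, "chunk_index": i}
        for i, c in enumerate(chunks)
    ]


if __name__ == "__main__":
    parts = chunk_text("aaaa\n\nbbbb\n\ncccc", max_chars=10)
    print(tag_chunks(parts, "engineering", "General", "Documents/HR/guide.docx"))
```

Splitting on paragraph boundaries keeps related sentences together, a crude stand-in for semantic chunking.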

Step 04
Embed & Index

Chunks are embedded with Azure OpenAI and pushed to Azure AI Search for hybrid BM25 + vector + semantic search across all Teams content.

Built on Microsoft Graph + Azure

Teams API: Microsoft Graph, /teams endpoint
Auth Scope: Group.Read.All · Files.Read.All
Site Resolution: Graph /groups/{id}/sites/root
Storage: Azure Data Lake Storage Gen2
Document Extraction: Native format parsers
Chunking: Hybrid chunker
Embeddings: Azure OpenAI text-embedding-3-small
Search Index: Azure AI Search · HNSW · BM25 · Semantic
Search Portal: Web Portal

Index Teams. Search instantly.

One setup. Every team. Every channel. Every file, automatically extracted, indexed, and searchable through the built-in portal.
