💬

Microsoft 365 Connector + Search Portal

Teams & Search Portal

Automatically extract every file shared across Microsoft Teams channels and team sites — then search across all indexed content through a built-in Streamlit search portal.

Auto

Team discovery

Channel types

Hybrid

BM25 + Vector search

100%

Native extraction

Coverage

Every Teams file surface, covered

ChunkIQ discovers all teams in your tenant and indexes files across all channel types automatically.

📢

Standard Channels

Files posted or shared in standard channels. ChunkIQ automatically maps the underlying SharePoint team site and crawls the Files tab library.

🔒

Private Channels

Private channels have their own dedicated SharePoint sites. Each is discovered and crawled separately with appropriate permission scopes.

🌐

Shared Channels

Cross-tenant shared channels and their associated document libraries are enumerated and included in the extraction run.

🏢

Team Sites (SharePoint)

Every Microsoft Team has a backing SharePoint site. ChunkIQ crawls all document libraries on these sites, not just the default Files tab.

Capabilities

Built for large, complex Teams environments

No per-team configuration needed. Add ChunkIQ once and every current and future team is covered.

🔭

Automatic Team Discovery

Enumerates every team in the tenant. Teams created after initial setup are picked up on the next run without any configuration change.

📂

Deep Library Crawl

Goes beyond the default Files tab to discover custom document libraries, wiki content libraries, and any other SharePoint libraries on the team site.

📎

Attachment Extraction

Extracts and processes files embedded inside Word, Excel, and PowerPoint documents found in Teams, tracking lineage back to the parent file and channel.
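As a rough illustration of the idea (not ChunkIQ's actual extractor), the sketch below lists embedded objects inside an Office file. It relies on the fact that .docx, .xlsx, and .pptx files are ZIP archives, and assumes embedded objects sit under the usual `embeddings/` folders. The helper names and the sample archive are hypothetical.

```python
import io
import zipfile

# Assumption: OOXML packages usually keep embedded objects in these folders.
EMBEDDING_DIRS = ("word/embeddings/", "xl/embeddings/", "ppt/embeddings/")


def list_embedded_parts(package_bytes: bytes) -> list[str]:
    """Return the archive paths of embedded objects in a .docx/.xlsx/.pptx."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.startswith(EMBEDDING_DIRS) and not name.endswith("/")
        )


def make_sample_docx() -> bytes:
    """Build a tiny stand-in archive for demonstration (not a valid document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        archive.writestr("word/embeddings/Microsoft_Excel_Sheet1.xlsx", b"stub")
    return buffer.getvalue()


embedded = list_embedded_parts(make_sample_docx())
```

Each path returned this way can be recorded next to the parent file and channel, which is one simple way to keep the lineage described above.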

🗺️

Teams-Aware Metadata

Each chunk is tagged with team name, channel name, site URL, library, file path, and modification date — enabling fine-grained filtering in search results.
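To show how such tags could drive filtering, here is a minimal sketch that turns optional criteria into an OData filter string of the kind Azure AI Search accepts. The field names (`team_name`, `channel_name`, `modified`) are assumptions for illustration; the real index schema may differ.

```python
from typing import Optional


def odata_quote(value: str) -> str:
    """Quote a string literal for OData; single quotes are escaped by doubling."""
    return "'" + value.replace("'", "''") + "'"


def build_filter(team: Optional[str] = None,
                 channel: Optional[str] = None,
                 modified_since: Optional[str] = None) -> str:
    """Combine optional criteria into one OData filter expression."""
    clauses = []
    if team is not None:
        clauses.append(f"team_name eq {odata_quote(team)}")  # hypothetical field
    if channel is not None:
        clauses.append(f"channel_name eq {odata_quote(channel)}")  # hypothetical field
    if modified_since is not None:
        # DateTimeOffset literals are written without quotes in OData.
        clauses.append(f"modified ge {modified_since}")  # hypothetical field
    return " and ".join(clauses)


example = build_filter(team="product-team", channel="General")
```

An empty string means no filter, so the same function serves both filtered and unfiltered searches.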

🔄

Incremental Updates

Content hash deduplication means only new or changed files are re-extracted, so large tenants with thousands of files skip unchanged content on every run.
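The idea can be sketched in a few lines. This is an illustration under stated assumptions: SHA-256 is used as the hash, and the `seen_hashes` dictionary stands in for whatever state store the pipeline actually uses.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex digest of the file content (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def files_to_extract(current: dict[str, bytes], seen_hashes: dict[str, str]) -> list[str]:
    """Return paths whose content is new or changed since the previous run."""
    changed = []
    for path, data in current.items():
        digest = content_hash(data)
        if seen_hashes.get(path) != digest:
            changed.append(path)
            seen_hashes[path] = digest  # remember for the next run
    return changed


seen: dict[str, str] = {}
first_run = files_to_extract({"a.docx": b"v1", "b.pdf": b"x"}, seen)
second_run = files_to_extract({"a.docx": b"v2", "b.pdf": b"x"}, seen)
```

On the first run both files are returned; on the second, only the file whose bytes changed.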

🔒

Secure by Design

Runs entirely within your Azure tenant. Uses a managed identity or a service principal with the minimum required permissions.

Built-in Search UI

Search your Teams content instantly

ChunkIQ Teams ships with a Streamlit-powered search portal. Hybrid retrieval (BM25 + vector) with semantic re-ranking returns the most relevant chunks from every team, channel, and document.

🔍

Hybrid Search

Combines BM25 keyword scoring with 1,536-dimensional HNSW vector search. The two result lists are merged with Reciprocal Rank Fusion, then re-ranked with Azure AI Search semantic ranking.
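For readers unfamiliar with Reciprocal Rank Fusion, a minimal sketch follows. Azure AI Search performs this step internally; the code only illustrates the formula, where each document scores 1 / (k + rank) per list. The constant k = 60 is a commonly used value and is assumed here, as are the sample document ids.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one fused ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (highest first); break ties by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # keyword ranking (sample data)
vector_hits = ["b", "d", "a"]  # vector ranking (sample data)
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Documents that appear high in both lists (here `b` and `a`) rise to the top, without any need to put BM25 and vector scores on a common scale.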

🗂️

Source-Aware Results

Every result shows the originating team, channel, file name, and folder path. Click straight through to the source document in Microsoft Teams.

⚡

Zero-Config Portal

The Streamlit app connects directly to your Azure AI Search index using the same connection settings as the pipeline. No additional backend required.
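The portal's internals are not shown on this page. As a rough idea of what a hybrid query against Azure AI Search can look like, the sketch below builds a request body modeled on the service's REST search format. The vector field name and semantic configuration name are hypothetical, and the exact body shape should be checked against the API version in use.

```python
import json


def build_hybrid_query(text: str, vector: list[float], top: int = 10) -> dict:
    """Build a search request body combining keyword, vector, and semantic ranking."""
    return {
        "search": text,  # keyword (BM25) part of the query
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": vector,
                "fields": "content_vector",  # hypothetical vector field name
                "k": 50,                     # nearest neighbors to retrieve
            }
        ],
        "queryType": "semantic",
        "semanticConfiguration": "default",  # hypothetical configuration name
        "top": top,
    }


# A zero vector stands in for a real 1,536-dimensional query embedding.
body = build_hybrid_query("Q3 roadmap", [0.0] * 1536)
payload = json.dumps(body)
```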

🔍 Search across all Teams content…

Q3 Product Roadmap.pptx

💬 product-team · General · Slide 4

"…the new pipeline will support incremental delta sync from SharePoint and OneDrive, reducing ingest time by…"

Engineering Onboarding Guide.docx

💬 engineering · Private · Documents/HR

"…access to ADLS Gen2 is granted via managed identity. No connection strings are stored in code or…"

Sprint 22 Retrospective.pdf

💬 dev-team · Shared · Meeting Notes

"…agreed to migrate the extraction stage to native parsers to remove the Document Intelligence dependency and…"

How it works

From Teams to searchable index in 4 steps

Step 01

🔑

Authenticate

Authenticates via Microsoft Entra ID (formerly Azure AD) with Group.Read.All and Files.Read.All permissions to access all teams and their underlying SharePoint sites.
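For the service principal path, authentication typically uses the OAuth 2.0 client credentials flow. The sketch below only builds the token request and does not send it; the tenant, client id, and secret values are placeholders, and a real deployment would more likely rely on a library such as MSAL or azure-identity than on hand-built requests.

```python
from urllib.parse import urlencode

# The ".default" scope requests the application permissions already granted
# to the app registration (for example Group.Read.All and Files.Read.All).
GRAPH_DEFAULT_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, str]:
    """Return the token endpoint URL and form-encoded body for client credentials."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": GRAPH_DEFAULT_SCOPE,
    })
    return url, body


# Placeholder values for illustration only.
token_url, token_body = build_token_request("contoso-tenant", "app-id", "s3cret")
```

With a managed identity, no secret is handled in code at all; the platform issues the token instead.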

Step 02

🔍

Discover Teams & Sites

Enumerates all teams, resolves their SharePoint team sites, and lists every document library — standard, private, and shared channels included.
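A minimal sketch of the discovery loop, assuming the Graph v1.0 `/teams` collection and the `/groups/{id}/sites/root` lookup named on this page. Graph returns large collections in pages linked by `@odata.nextLink`; the `fetch` callable and the fake pages below are stand-ins so the example runs without network access.

```python
from typing import Callable, Iterator

GRAPH = "https://graph.microsoft.com/v1.0"


def iter_pages(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item from a paged Graph collection, following @odata.nextLink."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")


def team_site_url(group_id: str) -> str:
    """URL of the SharePoint root site backing a team (team id equals group id)."""
    return f"{GRAPH}/groups/{group_id}/sites/root"


# Fake responses standing in for real Graph calls (illustration only).
FAKE_PAGES = {
    f"{GRAPH}/teams": {"value": [{"id": "t1"}], "@odata.nextLink": f"{GRAPH}/teams?page=2"},
    f"{GRAPH}/teams?page=2": {"value": [{"id": "t2"}]},
}

team_ids = [team["id"] for team in iter_pages(f"{GRAPH}/teams", FAKE_PAGES.__getitem__)]
site_urls = [team_site_url(team_id) for team_id in team_ids]
```

In a real run, `fetch` would issue an authenticated GET and return the parsed JSON response.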

Step 03

📄

Extract & Chunk

Dedicated extractors process each file format. Text is cleaned, split into semantic chunks, and enriched with Teams-specific provenance metadata.
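The chunking strategy itself is not described here. As one plausible illustration, and not ChunkIQ's actual chunker, the sketch below packs whole paragraphs greedily into chunks under a character budget. The function name and the 500-character default are assumptions.

```python
def chunk_paragraphs(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than split.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue  # skip empty paragraphs left over from cleaning
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


sample_chunks = chunk_paragraphs("aaaa\n\nbbbb\n\ncccc", max_chars=10)
```

Keeping paragraphs intact is a simple way to avoid cutting a sentence in half, at the cost of uneven chunk sizes.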

Step 04

⚡

Embed & Index

Chunks are embedded with Azure OpenAI and pushed to Azure AI Search for hybrid BM25 + vector + semantic search across all Teams content.

Under the hood

Built on Microsoft Graph + Azure

Teams API

Microsoft Graph — /teams endpoint

Auth Scope

Group.Read.All · Files.Read.All

Site Resolution

Graph /groups/{id}/sites/root

Storage

Azure Data Lake Storage Gen2

Document Extraction

Native format parsers

Chunking

Hybrid chunker

Embeddings

Azure OpenAI text-embedding-3-small

Search Index

Azure AI Search · HNSW · BM25 · Semantic

Search Portal

Streamlit web portal