DLP.All, Audit.Exchange, Audit.SharePoint, Audit.General, Service Health, Message Trace, Entra ID metadata. This is the audit layer, not the complete investigation layer.
DataSecurityEvents is in Preview and requires IRM opt-in.
Layer 1 — Raw Microsoft Telemetry
Keep raw JSON. Do not over-transform too early.
| Index | Source |
|---|---|
| idx_ms_o365_audit | Office 365 Management Activity API — Audit.Exchange, Audit.SharePoint, Audit.General |
| idx_ms_o365_dlp | DLP.All feed |
| idx_ms_o365_retention | UAL retention policy, disposition, record declaration events (RecordType 50) |
| idx_ms_purview_insider_risk | UAL InsiderRiskManagement events (RecordTypes 306, 307, 308) |
| idx_ms_defender_incidents | Defender XDR incidents |
| idx_ms_defender_alerts | Defender XDR alerts |
| idx_ms_defender_hunting | Scheduled Advanced Hunting query outputs |
| idx_ms_service_health | O365 service health / communications |
Layer 2 — Normalized Purview Control Facts
idx_purview_control_facts · purview_summary
Each row represents one control-relevant fact. idx_purview_control_facts is the unified normalized fact store for cross-family correlation and dashboard queries. purview_summary is the operational summary index populated by daily collect searches (see SPL Query Pack §8) — it feeds the KPI marts. Both serve Layer 2; use purview_summary as the write target for scheduled searches and idx_purview_control_facts as the curated read target for dashboards.
| Field | Meaning |
|---|---|
| event_time | When the event happened |
| ingest_time | When Splunk received it |
| source_plane | UAL, DefenderIncident, DefenderAlert, AdvancedHunting |
| workload | Exchange, SharePoint, OneDrive, Teams, Endpoint, Browser, etc. |
| policy_name | DLP / label / IRM policy |
| rule_name | DLP rule |
| rule_action | Audit, notify, warn, block, restrict, override, etc. |
| enforcement_mode | Audit, warn, block, allow, etc. |
| user_upn | Actor |
| recipient_domain | External destination where available |
| file_name / file_extension | Content object and type |
| sensitivity_label | Label at time of event |
| sit_names / sit_count | Sensitive info types matched + count |
| confidence | SIT confidence where available |
| alert_id / incident_id | Defender / Purview alert and incident linkage |
| severity | Alert severity |
| status | Alert / incident status |
| classification | TP / FP / benign / unknown |
| ticket_id | ServiceNow / Jira etc. |
| maturity_status | Blank / Partial / Live |
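The "one row per control-relevant fact" idea can be sketched in Python. This is a minimal illustration, not the pipeline's actual parser: it assumes DLP events nest policy details as PolicyDetails[].Rules[].Actions[], and the individual key names (CreationTime, Workload, UserId, PolicyName, RuleName) plus the sample event are assumptions to verify against real payloads.

```python
from typing import Any, Iterator


def flatten_dlp_event(event: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield one fact row per (policy, rule, action) found in a raw DLP event."""
    base = {
        "event_time": event.get("CreationTime"),  # assumed key name
        "source_plane": "UAL",
        "workload": event.get("Workload"),        # assumed key name
        "user_upn": event.get("UserId"),          # assumed key name
    }
    for policy in event.get("PolicyDetails") or []:
        for rule in policy.get("Rules") or []:
            # A rule without actions still yields one row so the match is not lost.
            for action in rule.get("Actions") or [None]:
                yield {
                    **base,
                    "policy_name": policy.get("PolicyName"),  # assumed key name
                    "rule_name": rule.get("RuleName"),        # assumed key name
                    "rule_action": action,
                }


# Hypothetical event, shaped only for illustration.
sample_event = {
    "CreationTime": "2024-05-01T10:00:00",
    "Workload": "Exchange",
    "UserId": "user@example.com",
    "PolicyDetails": [
        {
            "PolicyName": "PCI Policy",
            "Rules": [
                {"RuleName": "Block external card data", "Actions": ["BlockAccess", "NotifyUser"]},
                {"RuleName": "Audit internal card data", "Actions": []},
            ],
        }
    ],
}

fact_rows = list(flatten_dlp_event(sample_event))
for row in fact_rows:
    print(row["policy_name"], row["rule_name"], row["rule_action"])
```

Emitting a row even when a rule has no actions keeps "events missing policy / rule / action" measurable downstream instead of silently dropping them.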
Layer 3 — KPI Marts
Prevent dashboards from repeatedly parsing nested JSON.
| Mart | Audience |
|---|---|
| kpi_purview_health_daily | Engineering |
| kpi_dlp_effectiveness_daily | Engineering / SOC |
| kpi_investigation_operations_daily | Investigations / SOC |
| kpi_retention_lifecycle_daily | Records Management / Compliance |
| kpi_insider_risk_daily | Investigations — privacy-controlled; RBAC required |
| kpi_exec_control_score_monthly | Executive |
| kpi_audit_evidence_monthly | Audit / Compliance |
Lookup Tables
| Lookup | Fields |
|---|---|
| label_taxonomy_lookup | label_id, label_name, reporting_label_family, protection_level, encryption_expected, external_sharing_allowed |
| sit_family_lookup | sit_name, sit_family, regulated_data_type, executive_category, severity_modifier |
| dlp_policy_lookup | policy_name, policy_owner, control_objective, workload_scope, deployment_status, expected_action, executive_category |
| kpi_maturity_lookup | kpi_name, data_source, maturity_status, owner, known_gap, remediation_plan |
Engineering KPIs
| KPI | Source | Maturity | Notes |
|---|---|---|---|
| DLP event ingestion freshness | Splunk _indextime vs event time | LIVE | Proves pipeline health |
| UAL subscription / content availability | O365 Management Activity API | LIVE | Detects broken audit feed |
| DLP policy match volume | DLP.All, Advanced Hunting | LIVE | Basic activity control |
| Rule action distribution | UAL + AH enrichment | PARTIAL | Needs parsing / enrichment |
| SIT confidence distribution | AH / Purview event details | PARTIAL | May be inconsistent by workload |
| Label usage by workload | UAL + label events | PARTIAL | Requires label taxonomy mapping |
| Auto-labeling activity trend | UAL / Purview classification events | PARTIAL | Stronger with Activity Explorer export/API |
| SHIR / scanner health | Purview governance / Data Map telemetry | BLANK | Separate source — do not fake it |
| OCR pipeline status | Control register | BLANK | Mark "not deployed" until telemetry exists |
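The ingestion freshness KPI compares when an event happened with when Splunk indexed it. The sketch below shows that arithmetic outside Splunk, assuming event times arrive as ISO-8601 strings (treated as UTC when no offset is present) and index times as epoch seconds. The 24-hour staleness threshold is a hypothetical default, not a documented requirement.

```python
from datetime import datetime, timezone
from statistics import median


def freshness_lag_seconds(event_time: str, index_time: float) -> float:
    """Seconds between the event (ISO-8601, UTC if naive) and indexing (epoch)."""
    occurred = datetime.fromisoformat(event_time)
    if occurred.tzinfo is None:
        occurred = occurred.replace(tzinfo=timezone.utc)
    return index_time - occurred.timestamp()


def freshness_summary(pairs, max_lag_seconds: float = 24 * 3600) -> dict:
    """Summarise lag for a non-empty list of (event_time, index_time) pairs."""
    lags = [freshness_lag_seconds(event, indexed) for event, indexed in pairs]
    return {
        "median_lag_seconds": median(lags),
        "max_lag_seconds": max(lags),
        "stale_events": sum(1 for lag in lags if lag > max_lag_seconds),
    }


def epoch(year, month, day, hour):
    """Helper for the illustrative sample below."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp()


# Hypothetical (event time, index time) pairs.
sample_pairs = [
    ("2024-05-01T10:00:00", epoch(2024, 5, 1, 12)),
    ("2024-05-01T09:00:00", epoch(2024, 5, 1, 10)),
    ("2024-04-29T08:00:00", epoch(2024, 5, 1, 12)),
]
freshness = freshness_summary(sample_pairs)
print(freshness)
```

Reporting the median alongside the maximum helps separate a generally slow feed from a single late-arriving content blob.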
Investigation KPIs
| KPI | Source | Maturity | Notes |
|---|---|---|---|
| Alert volume by severity | Defender XDR alerts | LIVE | Clean executive / SOC metric |
| Incident volume by status | Defender XDR incidents | LIVE | Operational backlog |
| Triage queue depth | Defender incidents (Active + In Progress) | LIVE | Real-time operational view |
| Aging by severity | Defender incidents | LIVE | Critical for audit defensibility |
| Repeat offender users / entities | UAL + Defender evidence | LIVE | Strong SOC value |
| Mean time to triage (MTTA) | Defender incident status + assignment | PARTIAL | Requires lifecycle event capture |
| Mean time to resolve (MTTR) | Defender incident resolved timestamp | PARTIAL | Better via incident API |
| False-positive rate | Defender classification | PARTIAL | Requires analysts to classify |
| Top exfiltration vectors | DLP events + workload / action | PARTIAL | Needs normalization |
| Ticket creation latency | Defender alert time + ticket time | PARTIAL | Requires ITSM integration |
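MTTA and MTTR are marked PARTIAL because they depend on lifecycle timestamps being captured. A minimal sketch of the calculation follows. The field names created_time, first_assigned_time, and resolved_time are hypothetical stand-ins for whatever the incident feed actually provides, and incidents missing a timestamp are skipped rather than counted as zero.

```python
from datetime import datetime
from statistics import mean


def mean_hours(incidents, start_key, end_key):
    """Mean elapsed hours between two timestamps; None if no incident has both."""
    deltas = [
        (datetime.fromisoformat(i[end_key]) - datetime.fromisoformat(i[start_key])).total_seconds() / 3600
        for i in incidents
        if i.get(start_key) and i.get(end_key)
    ]
    return mean(deltas) if deltas else None


# Hypothetical incidents; the second one is still open.
sample_incidents = [
    {"created_time": "2024-05-01T08:00:00", "first_assigned_time": "2024-05-01T09:00:00",
     "resolved_time": "2024-05-01T20:00:00"},
    {"created_time": "2024-05-02T08:00:00", "first_assigned_time": "2024-05-02T11:00:00",
     "resolved_time": None},
]

mtta_hours = mean_hours(sample_incidents, "created_time", "first_assigned_time")
mttr_hours = mean_hours(sample_incidents, "created_time", "resolved_time")
print(mtta_hours, mttr_hours)
```

Because open incidents are excluded from MTTR, the metric should be read together with the aging and queue-depth KPIs.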
Retention / Data Lifecycle KPIs
| KPI | Source | Maturity | Notes |
|---|---|---|---|
| Retention policies deployed by workload | UAL retention policy events | BLANK | Proves scope coverage |
| Retention labels applied by label/workload | UAL label application events | BLANK | Proves label adoption |
| Records declared by workload | UAL record declaration events | BLANK | Records management activity |
| Items pending disposition review | UAL disposition events | BLANK | Disposition queue backlog |
| Average disposition review age | UAL disposition timestamps | BLANK | SLA compliance indicator |
| Disposition approvals / rejections | UAL disposition decision events | BLANK | Control execution evidence |
| Items eligible for deletion | UAL deletion eligibility events | BLANK | Lifecycle maturity signal |
| Items deleted after retention | UAL delete execution events | BLANK | Disposal execution proof |
| Items relabeled during disposition | UAL relabel / RelabelItem events | BLANK | Governance correction evidence |
| Workloads without retention coverage | Policy scope vs workload inventory | BLANK | Executive risk gap |
Insider Risk Management KPIs
| KPI | Source | Maturity | Notes |
|---|---|---|---|
| IRM alerts by policy | UAL RecordType 307 (InsiderRiskManagementAlert) | BLANK | Policy activity baseline |
| IRM alerts by risk level | UAL RecordType 307 | BLANK | Risk distribution |
| IRM cases opened | UAL RecordType 308 (InsiderRiskManagementCase) | BLANK | Investigation demand |
| IRM cases closed | UAL RecordType 308 | BLANK | Throughput |
| IRM case aging | UAL RecordType 308 timestamps | BLANK | Backlog risk indicator |
| IRM false-positive rate | UAL RecordType 307 + case resolution (308) | BLANK | Tuning quality indicator |
| IRM risk activity volume | UAL RecordType 306 (InsiderRiskManagement — activity events) | BLANK | Exfiltration signals, policy matches, sequences |
| IRM escalation rate | UAL RecordTypes 307 → 308 correlation | BLANK | Alert → case promotion rate |
| IRM policy coverage | Policy inventory | BLANK | Deployment maturity |
| IRM alert-to-case conversion rate | RecordTypes 307 + 308 correlation | BLANK | Investigation selectivity |
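The alert-to-case conversion rate correlates RecordType 307 (alerts) with RecordType 308 (cases). The sketch below assumes, and this needs validating against real records, that a normalized case record carries the originating irm_alert_id. It returns an aggregate only, in keeping with the privacy-controlled nature of this mart.

```python
def alert_to_case_rate(records):
    """Share of distinct IRM alerts (307) that appear on at least one case (308)."""
    alert_ids = {r["irm_alert_id"] for r in records if r["record_type"] == 307}
    # Assumption: case records reference the alert that was promoted.
    promoted = {
        r["irm_alert_id"]
        for r in records
        if r["record_type"] == 308 and r.get("irm_alert_id")
    }
    if not alert_ids:
        return None  # no alerts: leave the KPI blank rather than reporting 0
    return len(alert_ids & promoted) / len(alert_ids)


# Hypothetical normalized records.
sample_records = [
    {"record_type": 307, "irm_alert_id": "A1"},
    {"record_type": 307, "irm_alert_id": "A2"},
    {"record_type": 307, "irm_alert_id": "A3"},
    {"record_type": 307, "irm_alert_id": "A4"},
    {"record_type": 308, "irm_case_id": "C1", "irm_alert_id": "A2"},
]

conversion_rate = alert_to_case_rate(sample_records)
print(conversion_rate)
```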
Executive KPIs
| KPI | Source | Maturity | Notes |
|---|---|---|---|
| Program coverage % | Control inventory + policy scope | PARTIAL | Needs authoritative scope inventory |
| Protected locations | Policy scope + workload coverage | PARTIAL | Do not infer from events alone |
| Control Health composite score | KPI mart | PARTIAL | Good exec rollup — see formula below |
| Risk exposure trend, 90 days | Alerts + DLP events | LIVE once retained | Strong leadership metric |
| NPI / PCI protected vs exposed | SIT + action outcome | PARTIAL | Needs SIT taxonomy mapping |
| Block / allow-with-override ratio | DLP enforcement / action | PARTIAL | Key effectiveness metric |
| Member-data incidents avoided | Blocked / restricted events | PARTIAL | Define carefully — avoid inflated claims |
| Retention deployment status | UAL / idx_ms_o365_retention | BLANK | Now sourced via retention fact family |
| IRM program active (T/F) | UAL RecordTypes 306/307/308 | BLANK | Confirms IRM is deployed and generating signals |
Control Health Score
| Component | Weight |
|---|---|
| Ingestion freshness | 20% |
| DLP policy telemetry present | 20% |
| Alert pipeline active | 20% |
| Policy / rule / action parse success | 15% |
| Incident lifecycle completeness | 15% |
| KPI maturity completeness | 10% |
Effectiveness Score
| Component | Weight |
|---|---|
| Sensitive events protected by block / restrict / warn | 25% |
| High-risk events reduced over 90 days | 20% |
| False-positive rate reduced | 20% |
| Mean time to triage improved | 15% |
| Repeat offenders reduced | 10% |
| Override rate controlled | 10% |
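Both composite scores are weighted sums of their components. A small sketch of the arithmetic, using the weights from the two tables above and the component names from the engineering prompt's formulas; it assumes each component has already been scaled to 0–100, which the source does not specify.

```python
HEALTH_WEIGHTS = {
    "ingestion_freshness_score": 0.20,
    "dlp_event_presence_score": 0.20,
    "alert_pipeline_score": 0.20,
    "parsing_quality_score": 0.15,
    "incident_lifecycle_score": 0.15,
    "kpi_maturity_score": 0.10,
}

EFFECTIVENESS_WEIGHTS = {
    "protected_events_score": 0.25,
    "high_risk_reduction_score": 0.20,
    "fp_reduction_score": 0.20,
    "mtta_improvement_score": 0.15,
    "repeat_offender_score": 0.10,
    "override_rate_score": 0.10,
}


def composite_score(components: dict, weights: dict) -> float:
    """Weighted sum of component scores; fails loudly if a component is missing."""
    missing = weights.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(weight * components[name] for name, weight in weights.items())


# Hypothetical component values on an assumed 0-100 scale.
health_components = {
    "ingestion_freshness_score": 100,
    "dlp_event_presence_score": 100,
    "alert_pipeline_score": 100,
    "parsing_quality_score": 80,
    "incident_lifecycle_score": 60,
    "kpi_maturity_score": 50,
}
health_score = composite_score(health_components, HEALTH_WEIGHTS)
print(round(health_score, 1))
```

Raising on a missing component, rather than treating it as zero, keeps a BLANK data source from quietly dragging the score down without explanation.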
Dashboard 1: Purview Pipeline Health (Audience: Engineering)
- UAL ingestion freshness
- DLP event count by day
- Defender alert count by day
- Defender incident count by day
- API gaps / zero-event days
- Duplicate rate
- Parse success rate
- Events missing policy / rule / action
- Splunk source / sourcetype / index status
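The "events missing policy / rule / action" panel is a per-field null rate over normalized fact rows. A minimal sketch, assuming rows use the field names from the Layer 2 table; the 5% alert threshold is the one suggested in the friction table for schema validation.

```python
REQUIRED_FIELDS = ("policy_name", "rule_name", "rule_action")
NULL_RATE_THRESHOLD = 0.05  # alert when more than 5% of rows lack a field


def missing_field_rates(rows):
    """Fraction of rows where each required field is absent, None, or empty."""
    total = len(rows)
    if total == 0:
        return {}
    return {
        field: sum(1 for row in rows if not row.get(field)) / total
        for field in REQUIRED_FIELDS
    }


# Hypothetical fact rows: 18 complete, 2 with no parsed action.
complete = {"policy_name": "PCI Policy", "rule_name": "Block external", "rule_action": "block"}
sample_rows = [dict(complete) for _ in range(18)]
sample_rows.append({**complete, "rule_action": None})
sample_rows.append({**complete, "rule_action": ""})

rates = missing_field_rates(sample_rows)
fields_to_alert = [field for field, rate in rates.items() if rate > NULL_RATE_THRESHOLD]
print(rates, fields_to_alert)
```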
Dashboard 2: DLP Control Effectiveness (Audience: Engineering / Security / Compliance)
- DLP events by workload
- DLP events by policy and rule
- Actions: audit / warn / block / override
- Block / allow-with-override ratio
- SIT families: NPI / PCI / PII
- SIT confidence distribution
- Top external domains
- Top users by sensitive activity
- Top files by repeat policy hits
- Label + SIT mismatch report
Dashboard 3: Investigation Operations (Audience: SOC / Investigations)
- Active incidents by severity
- Aging incidents by severity
- MTTA / MTTR
- False-positive rate
- True-positive rate
- Unassigned incidents
- Reopened incidents
- Top policies generating FPs
- High-volume users
- Multi-incident users / files / devices
Dashboard 4: Executive Control Scorecard (Audience: Leadership)
- Control Health composite score
- Effectiveness composite score
- 90-day exposure trend
- Protected vs exposed sensitive activity
- Block / restrict / warn / override ratio
- Member-data protection trend
- Program coverage by workload
- Top 5 control gaps
- KPI maturity: Blank / Partial / Live
Dashboard 6: Retention & Lifecycle
- Retention policy coverage by workload
- Retention label activity by label / workload
- Records declared — standard & regulatory
- Disposition queue — items pending review
- Disposition approvals & rejections
- Retention extensions
- Items relabeled at disposition
- Deletion-eligible content
- Items deleted after retention
- Lifecycle gaps by workload / site / mailbox
Dashboard 7: Insider Risk Management
- IRM alerts by policy (aggregate)
- IRM alerts by risk level
- IRM cases by status
- Case aging distribution
- Alert-to-case conversion rate
- False-positive rate trend
- Scoped users count trend
- Top triggering activity types
- Policy coverage & deployment maturity
- Privacy-safe executive summary panel
User-identifiable fields are restricted by Splunk RBAC. The executive view shows aggregate counts and risk bands only.
Dashboard 5: Audit Evidence (Audience: Audit / Compliance)
- Control objective
- Data source
- Evidence available
- Last successful event
- Last dashboard refresh
- Owner
- Gaps + remediation plan
- Screenshot / export link
- Monthly evidence package status
Reports without data are still evidence that the control framework exists, provided the report clearly shows the expected source, maturity state, gap, owner, and remediation path.
- Enable / verify Unified Audit Log
- Configure Splunk Add-on for Microsoft Office 365
- Ingest DLP.All, Audit.Exchange, Audit.SharePoint, Audit.General
- Configure Splunk Add-on for Microsoft Security
- Ingest Defender XDR incidents and alerts
- Create ingestion health dashboard
- Build Blank / Partial / Live maturity tags
Success: Splunk receives DLP + audit events daily. Defender incidents/alerts appear. Ingest freshness is measured. Data gaps are explicit and visible.
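The Blank / Partial / Live tags can be derived from a checklist rather than assigned by hand. The sketch below follows the rule that a KPI is Live only when source, refresh, parsing, owner, and dashboard validation are all complete. The check names are hypothetical, and treating "source not connected" as the sole trigger for Blank is an assumption.

```python
LIVE_CHECKS = (
    "source_connected",
    "refresh_verified",
    "parsing_complete",
    "owner_assigned",
    "dashboard_validated",
)


def maturity_status(checks: dict) -> str:
    """Return Blank, Partial, or Live for one KPI based on its completed checks."""
    if not checks.get("source_connected", False):
        return "Blank"  # dashboard may exist, but no source data yet
    if all(checks.get(name, False) for name in LIVE_CHECKS):
        return "Live"
    return "Partial"


# Hypothetical KPI checklists.
kpi_checks = {
    "DLP event ingestion freshness": dict.fromkeys(LIVE_CHECKS, True),
    "Rule action distribution": {"source_connected": True, "refresh_verified": True},
    "SHIR / scanner health": {},
}
maturity = {kpi: maturity_status(checks) for kpi, checks in kpi_checks.items()}
print(maturity)
```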
- Extract policy / rule / action fields from nested JSON
- Normalize workload names
- Normalize action outcomes: audit, notify, warn, allow, allow-with-override, block, restrict, encrypt/quarantine
- Map SIT names → reporting families: NPI, PCI, PII, Financial, Credentials, Legal/privileged, Custom
- Map labels → 4-label taxonomy: Public, Internal, Confidential, Restricted
Success: Top DLP policy/rule/action reports reliable. Executives see "protected vs exposed." Engineers see FP clusters. Investigators pivot from dashboard to incident.
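The normalization steps above amount to two lookups with an explicit fallback. A minimal sketch follows. The raw action strings and SIT names in the maps are illustrative assumptions, not a verified list of what Microsoft emits, and in Splunk the same mapping would normally live in lookup tables such as sit_family_lookup.

```python
# Hypothetical raw values mapped to the normalized outcomes listed above.
ACTION_MAP = {
    "audit": "audit",
    "notifyuser": "notify",
    "warn": "warn",
    "allowoverride": "allow-with-override",
    "blockaccess": "block",
    "encrypt": "encrypt/quarantine",
}

# Hypothetical SIT names mapped to reporting families.
SIT_FAMILY_MAP = {
    "credit card number": "PCI",
    "u.s. social security number (ssn)": "PII",
    "aba routing number": "Financial",
}


def _lookup(value, mapping):
    """Case-insensitive lookup; unknown or empty values are flagged, not guessed."""
    if not value:
        return "unmapped"
    return mapping.get(value.strip().lower(), "unmapped")


def normalize_action(raw_action):
    return _lookup(raw_action, ACTION_MAP)


def sit_family(sit_name):
    return _lookup(sit_name, SIT_FAMILY_MAP)


print(normalize_action(" BlockAccess "), sit_family("Credit Card Number"))
print(normalize_action("SomethingNew"), sit_family(None))
```

Returning "unmapped" instead of defaulting to Custom keeps taxonomy gaps visible, which matters for the parse success and KPI maturity measures.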
- Scheduled AH queries: DLP rule matches, alert evidence, high-risk users, file/device pivots, label + SIT + policy combos
- Summary export into Splunk
- Correlation: UAL event ID ↔ alert ID ↔ incident ID ↔ file/user/device ↔ ticket ID
- Incremental pulls — respect 30-day window, 100K row limit, rate limits
Success: Analyst dashboards stop being log viewers. Policy effectiveness and triage status visible. FP rate and MTTR become measurable.
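Incremental pulls can be planned as a series of fixed-width time bands starting from the last successful pull. The sketch below generates those bands and a matching KQL time filter. The 6-hour band width is the example used in the friction table, the 30-day clamp reflects the lookback window, and the function names are hypothetical. Datetimes are assumed to be timezone-aware UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator


def pull_windows(
    last_pull: datetime,
    now: datetime,
    band: timedelta = timedelta(hours=6),
    lookback: timedelta = timedelta(days=30),
) -> Iterator[tuple[datetime, datetime]]:
    """Yield (start, end] windows from the last pull up to now, clamped to the lookback."""
    start = max(last_pull, now - lookback)
    while start < now:
        end = min(start + band, now)
        yield start, end
        start = end


def kql_time_filter(start: datetime, end: datetime) -> str:
    """Build a KQL where-clause for one window (inputs assumed to be UTC)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (
        f"| where Timestamp > datetime({start.strftime(fmt)}) "
        f"and Timestamp <= datetime({end.strftime(fmt)})"
    )


now = datetime(2024, 5, 31, 12, 0, tzinfo=timezone.utc)
windows = list(pull_windows(now - timedelta(hours=15), now))
for start, end in windows:
    print(kql_time_filter(start, end))
```

Persisting the end of the last window that completed successfully (for example in a KV Store or lookup) lets a failed run resume without re-scanning or skipping data.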
- Monthly executive dashboard
- Audit evidence dashboard
- Control Health composite score
- 90-day exposure trend
- Block / override / allow trend
- Top risky workflows: external email, anonymous sharing, unmanaged device download, removable media, cloud upload, Teams oversharing, browser upload to unapproved domains
Success: Executives assess risk in <5 minutes. Auditors trace every report to source + control objective. Engineers see broken controls. Investigators see what requires action today.
When to Consider Alternatives
Consider Microsoft Sentinel or Defender portal-native reporting if: Splunk cannot ingest Defender XDR incidents/alerts properly; the team needs native KQL over Microsoft security tables; Advanced Hunting data is easier to operationalize in Microsoft-native tooling; leadership accepts Power BI/Sentinel workbooks instead of Splunk dashboards; or cost/data-volume concerns make full Splunk indexing unattractive.
For this project, the best answer is: Keep Splunk, but feed it better data. Use UAL for audit evidence, Defender XDR for alert lifecycle, and Advanced Hunting for enrichment.
| Friction Point | What Actually Happens | Mitigation |
|---|---|---|
| UAL event schema is deeply nested JSON | DLP events embed PolicyDetails[].Rules[].Actions[] as arrays-within-arrays. The Splunk O365 Add-on ingests raw JSON but does not auto-extract nested arrays into usable fields. Most engineers hit this in the first week and spend days writing spath + mvexpand chains. | Use spath + mvexpand (see SPL Query Pack §2). Write normalized results into a summary index daily. Never run nested JSON parsing in production dashboard queries; pre-compute instead. Keep raw events in a separate cold/warm tier. |
| O365 Management Activity API: 12-hour event delay | Audit events are not real-time. Microsoft's documented SLA for most audit records is up to 24 hours; typical observed latency is 1–12 hours for DLP events. SharePoint/OneDrive events have sometimes shown 24+ hour delays in documented incidents. Engineers expecting near-real-time SIEM data are surprised. | Do not use UAL for real-time alerting. Use Defender XDR incidents/alerts for operational alerting (they're faster). UAL is your audit evidence and compliance layer, not your SOC detection feed. Set dashboard refresh expectations accordingly. |
| Advanced Hunting API: 10K row limit per response | The API returns a maximum of 10,000 rows per query execution. The portal UI also caps results at 10,000 rows. There is no native streaming endpoint. Queries that return more than 10K results are silently truncated unless you implement pagination or time-banded incremental pulls. | Implement time-banded incremental pulls (e.g., 6-hour windows). Use Timestamp > {last_pull} in your KQL to avoid full re-scan. Schedule as a Splunk modular input or external Python script. Store the last successful pull timestamp in a KV Store or lookup. |
| Defender XDR SIEM connector deprecation / migration | Microsoft deprecated the legacy Defender for Endpoint SIEM API (siem.windows.com) in September 2024. The replacement is the Microsoft Defender XDR Streaming API (via Event Hubs) or the Microsoft Graph Security API. Teams using older Splunk add-on versions or legacy connector configs may find incidents/alerts silently stop flowing. | Use Splunk Add-on for Microsoft Security (not the legacy Defender for Endpoint add-on). Verify it is configured against the Defender XDR incidents/alerts API endpoints, not the deprecated SIEM API. Check Splunkbase for add-on version; the current certified version supports the Graph Security API. |
| Event Hubs as a required intermediary | Microsoft's recommended path for streaming Defender XDR and Purview signals at scale is via Azure Event Hubs. This adds an Azure infrastructure dependency (Event Hub namespace, consumer groups, connection strings, throughput units) that a pure Splunk shop may not have provisioned. The Splunk Add-on for Microsoft Cloud Services handles Event Hubs ingestion but requires separate configuration. | For moderate volume (under ~500K events/day), the polling-based O365 Add-on and Security Add-on are sufficient and simpler to operate. Switch to Event Hubs streaming only when polling lag becomes operationally unacceptable or when volume exceeds what the Management Activity API can serve within its rate limits. |
| O365 Management Activity API rate limits | The API enforces per-publisher, per-tenant rate limits. Heavy polling during high-event periods (e.g., large DLP scan sweeps, major incidents) can result in HTTP 429 throttling. The Splunk Add-on handles retries but silently queues content blobs — engineers often don't notice gaps until they check ingestion freshness metrics. | Monitor the idx_ms_service_health index and the Splunk Add-on internal logs (index=_internal sourcetype=splunk_ta_o365). Build ingestion gap detection into Dashboard 1 (Pipeline Health). Use separate content type subscriptions (DLP.All vs Audit.Exchange vs Audit.General) so a throttle on one feed doesn't block all audit data. |
| DLP event de-duplication | The Management Activity API content blob model means that a single DLP event can appear in multiple content blobs across polling windows, or be re-delivered after a transient API error. Engineers building DLP dashboards without de-duplication will over-count events, inflating policy match counts and exec metrics. | De-duplicate on stable event identifiers: Id (the audit record GUID in UAL) + CreationTime + UserId + ObjectId. Use dedup in dashboard queries or, better, de-duplicate at collect time in your summary index build (see SPL §8). Include event_key = md5(...) in your fact rows. |
| Purview Insider Risk data is not in UAL by default | RecordTypes 306, 307, and 308 (IRM) are not present in the standard UAL subscription. IRM audit records require: (1) IRM to be configured and policies active, (2) the tenant admin to have explicitly opted into IRM audit logging, and (3) the O365 Management Activity subscription to be configured for Audit.General, which is where IRM records appear. Some tenants never enable this. | Validate IRM audit record presence before building the IRM dashboard. Run the validation query (SPL §1.2) and confirm RecordTypes 306/307/308 are present. If absent, IRM may not be active, IRM audit logging may be disabled, or the Audit.General subscription may be missing. Escalate to the Microsoft Purview admin to verify IRM policy state and audit configuration. |
| DataSecurityEvents (Advanced Hunting) requires IRM opt-in | The DataSecurityEvents table in Advanced Hunting is in Preview and requires a separate IRM opt-in. Tenants that have not completed this opt-in will see the table return zero results or not appear at all in the AH schema. This is often discovered after hours of SPL debugging. | Treat DataSecurityEvents as optional enrichment, not a primary data source. Mark any KPIs depending on it as BLANK until opt-in is confirmed and results are validated. Document this as a known gap in the KPI maturity matrix. |
| Sensitivity label events are sparse in UAL | UAL captures label change events (apply, change, remove) but does not provide a snapshot of how many files are currently labeled. There is no "current label inventory" stream in UAL. Engineers expecting a labeled-file-count metric from Splunk alone will be unable to build it from UAL events. | Use UAL label events for activity trending, change detection, and downgrade alerting. For labeled-file inventory, use the Microsoft Graph API (Content Discovery), Purview Content Explorer export, or Activity Explorer API. Import these as a scheduled lookup or reference dataset in Splunk — do not try to reconstruct inventory from event counts. |
| Microsoft schema changes without notice | Microsoft periodically renames or restructures fields in UAL JSON payloads, Defender API responses, and Advanced Hunting table schemas — often without versioned change notifications. Teams have experienced: DLP policy fields moving inside nested arrays, RecordType values being added mid-cycle, AH column renames, and Defender API response envelope changes breaking ingestion. | Never hard-code field names in production SPL without defensive coalesce() fallbacks. Subscribe to the Microsoft 365 Message Center and the Defender XDR changelog. Build schema validation into Dashboard 1 (Pipeline Health) — alert when expected fields have a null rate above 5%. Maintain a field validation saved search that runs weekly. |
| Microsoft favors Sentinel for Purview integration | Microsoft's native Purview + Defender integration story is built for Sentinel / Log Analytics. The Purview Audit connector, the Defender XDR connector, and the Microsoft 365 Defender data connector all have first-class Sentinel support. SPL equivalents require more engineering effort, and some data surfaces (e.g., Purview Communication Compliance, certain Defender Identity signals) have no published Splunk integration path. | Accept Sentinel as a complementary system for native Microsoft-to-Microsoft signal paths. Feed curated, normalized outputs from Sentinel into Splunk via Sentinel's SIEM forwarding or Event Hub bridge, rather than trying to replicate every Microsoft connector in Splunk. See the Recommendation page for the hybrid model. |
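The de-duplication mitigation in the table (an event_key built from Id + CreationTime + UserId + ObjectId) can be sketched as follows. This is an illustration of the idea outside Splunk; the separator character and the choice to keep the first occurrence are assumptions, and MD5 is used here only as a fingerprint, not for security.

```python
import hashlib

KEY_FIELDS = ("Id", "CreationTime", "UserId", "ObjectId")


def event_key(event: dict) -> str:
    """Stable fingerprint for one audit record, built from its identifying fields."""
    raw = "|".join(str(event.get(field, "")) for field in KEY_FIELDS)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def deduplicate(events):
    """Keep the first occurrence of each event and attach its event_key."""
    seen = set()
    unique = []
    for event in events:
        key = event_key(event)
        if key not in seen:
            seen.add(key)
            unique.append({**event, "event_key": key})
    return unique


# Hypothetical records: the second is a re-delivery of the first.
sample_events = [
    {"Id": "guid-1", "CreationTime": "2024-05-01T10:00:00", "UserId": "a@example.com", "ObjectId": "doc1"},
    {"Id": "guid-1", "CreationTime": "2024-05-01T10:00:00", "UserId": "a@example.com", "ObjectId": "doc1"},
    {"Id": "guid-2", "CreationTime": "2024-05-01T10:05:00", "UserId": "b@example.com", "ObjectId": "doc2"},
]

unique_events = deduplicate(sample_events)
print(len(sample_events), "->", len(unique_events))
```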
Use with the Splunk / Microsoft engineering team. Covers the full architecture, field dictionary, lookup schemas, dashboard specs, KPI maturity definitions, and acceptance criteria.
You are a Microsoft Purview, Microsoft Defender XDR, and Splunk engineering team building enterprise-class, audit-defensible reporting for Purview DLP, sensitivity labeling, retention/control health, and investigation operations.

Project objective: Build Splunk dashboards and KPI marts across two reporting axes:
1. Health: Is the control operating?
2. Effectiveness: Is the control reducing risk?

Primary source systems:
- Microsoft Purview Unified Audit Log / Office 365 Management Activity API
- Office 365 Management Activity API content types:
  - DLP.All
  - Audit.Exchange
  - Audit.SharePoint
  - Audit.General
  - Audit.AzureActiveDirectory where useful
- Microsoft Defender XDR incidents and alerts
- Microsoft Defender Advanced Hunting API
- Optional enrichment from ServiceNow/Jira ticketing, HR/user metadata, and label taxonomy reference tables

Required Splunk add-ons/connectors:
- Splunk Add-on for Microsoft Office 365
- Splunk Add-on for Microsoft Security
- Optional: Splunk Add-on for Microsoft Cloud Services if Event Hubs ingestion is used

Design principle: Do not expect the Unified Audit Log to contain full investigation context. Treat UAL/O365 Management Activity as the audit evidence layer. Treat Defender XDR incidents/alerts and Advanced Hunting exports as the investigation and enrichment layer. Splunk is the reporting, normalization, correlation, and executive KPI layer.

Required raw indexes:
- idx_ms_o365_audit
- idx_ms_o365_dlp
- idx_ms_o365_retention (UAL retention, disposition, records management events)
- idx_ms_purview_insider_risk (UAL RecordTypes 306, 307, 308)
- idx_ms_defender_incidents
- idx_ms_defender_alerts
- idx_ms_defender_hunting
- idx_ms_service_health

Normalized fact families and sourcetypes:
- purview:dlp:fact → purview_dlp_fact
- purview:label:fact → purview_label_fact
- purview:retention:fact → purview_retention_lifecycle_fact
- purview:insider_risk:fact → purview_insider_risk_fact
- defender:incident:fact → defender_incident_fact
- defender:alert:fact → defender_alert_fact
- purview:control:fact → purview_control_fact (unified)

Required curated indexes or summary indexes:
- idx_purview_control_facts
- kpi_purview_health_daily
- kpi_dlp_effectiveness_daily
- kpi_investigation_operations_daily
- kpi_exec_control_score_monthly
- kpi_audit_evidence_monthly
- kpi_retention_lifecycle_daily
- kpi_insider_risk_daily (privacy-controlled; RBAC required before enabling)

Normalize the following fields into idx_purview_control_facts:
- event_time, ingest_time, source_plane, workload, operation
- policy_name, rule_name, rule_action, enforcement_mode
- user_upn, user_department, user_title
- recipient, recipient_domain, external_internal_flag
- file_name, file_extension, file_path, site_url
- device_name, device_id, ip_address
- sensitivity_label, sensitivity_label_id
- sit_names, sit_family, sit_count, confidence
- alert_id, incident_id, severity, status, classification, determination, assigned_to
- ticket_id, maturity_status

Create lookup tables:
1. label_taxonomy_lookup: label_id | label_name | reporting_label_family | protection_level | encryption_expected | external_sharing_allowed
2. sit_family_lookup: sit_name | sit_family | regulated_data_type | executive_category | severity_modifier
3. dlp_policy_lookup: policy_name | policy_owner | control_objective | workload_scope | deployment_status | expected_action | executive_category
4. kpi_maturity_lookup: kpi_name | data_source | maturity_status | owner | known_gap | remediation_plan

Build dashboards:

Dashboard 1: Purview Pipeline Health (Audience: Engineering)
- UAL ingestion freshness, DLP.All ingestion freshness
- Defender incident and alert ingestion freshness
- Service health events
- Event volume by source plane
- Parse success rate, missing policy/rule/action rate
- API or connector error count, zero-event days by feed

Dashboard 2: DLP Control Effectiveness (Audience: Engineering / Security / Compliance)
- DLP events by workload, policy, rule
- Action distribution: audit, notify, warn, block, allow, override, restrict
- Block / allow-with-override ratio
- SIT family distribution: NPI, PCI, PII, financial, credentials, legal, custom
- SIT confidence distribution
- Label + SIT mismatch report
- Top external domains, top risky users, top risky files
- 90-day risk exposure trend

Dashboard 3: Investigation Operations (Audience: SOC / Investigations)
- Incidents by severity and status
- Alerts by severity
- Active queue depth, aging by severity
- Mean time to acknowledge / triage (MTTA)
- Mean time to resolve (MTTR)
- False-positive rate and true-positive rate
- Unassigned and reopened incidents
- Top policies producing false positives
- Top entities appearing across multiple incidents

Dashboard 4: Executive Control Scorecard (Audience: Leadership)
- Control Health composite score and Effectiveness composite score
- Program coverage percentage
- Protected vs exposed sensitive activity
- Block/restrict/warn/override trend
- Member-data protection trend
- Top 5 control gaps
- KPI maturity: Blank, Partial, Live
- 90-day risk trend, month-over-month improvement

Dashboard 5: Audit Evidence (Audience: Audit / Compliance)
- Control objective, evidence source, dashboard name
- Current maturity state, last successful event, last refresh
- Owner, known gap, remediation plan
- Export / screenshot evidence status

Define KPI maturity states:
- Blank: dashboard/control exists, but source data is not yet available or connected.
- Partial: some data exists but coverage, parsing, or enrichment is incomplete.
- Live: data is connected, normalized, refreshed, and report-ready.

Composite Health Score formula:
0.20 * ingestion_freshness_score + 0.20 * dlp_event_presence_score + 0.20 * alert_pipeline_score + 0.15 * parsing_quality_score + 0.15 * incident_lifecycle_score + 0.10 * kpi_maturity_score

Composite Effectiveness Score formula:
0.25 * protected_events_score + 0.20 * high_risk_reduction_score + 0.20 * fp_reduction_score + 0.15 * mtta_improvement_score + 0.10 * repeat_offender_score + 0.10 * override_rate_score

Advanced Hunting enrichment requirements:
Create scheduled Advanced Hunting queries for:
- DLP rule matches by policy/rule/action
- AlertInfo joined to AlertEvidence
- User/file/device/entity evidence extraction
- High-risk user and repeat-entity detection
- Label + SIT + DLP policy combinations
- Endpoint/removable media/browser upload events where available
- DataSecurityEvents where available and licensed/opted in (NOTE: Preview; requires IRM opt-in)

Engineering constraints:
- Advanced Hunting API: 30-day lookback, up to 10,000 rows per API response (paginate for larger result sets), rate-limited, 3-minute query timeout. Use incremental scheduled pulls. Note: the portal UI also caps at 10K rows per query run.
- Preserve raw events before normalization.
- Expect duplicate O365 Management Activity events. Deduplicate using stable event identifiers: workload, operation, user, object, event time, source record ID.
- Do not inflate "incidents avoided." Define as blocked/restricted/warned sensitive events and label clearly as a proxy metric.
- Do not mark a KPI Live unless source, refresh, parsing, owner, and dashboard validation are complete.

Deliverables:
1. Data source inventory
2. Splunk ingestion map
3. Normalized field dictionary
4. Lookup table schemas
5. KPI maturity matrix
6. Dashboard wireframes
7. Initial SPL searches
8. Advanced Hunting KQL query pack
9. Audit evidence register
10. Gap / remediation log

Additional required domains: Data Retention / Data Lifecycle Management
Ingest and normalize Microsoft Purview audit activities related to:
- retention policy configuration and publication
- retention label application, change, and removal
- record declaration (standard and regulatory)
- disposition review: pending, approved, rejected, extended
- item relabeling during disposition (RelabelItem)
- retention extension (ExtendRetention)
- deletion eligibility and deletion execution
- retention exceptions and hold status
- lifecycle workload coverage

Normalize into purview_retention_lifecycle_fact:
- retention_policy_id, retention_policy_name
- retention_label_id, retention_label_name, retention_label_action
- retention_action, retention_duration, retention_trigger
- retention_start_date, retention_expiration_date
- record_status, is_record, is_regulatory_record
- disposition_review_status, disposition_stage, disposition_reviewer
- disposition_decision, disposition_decision_time, disposition_comments
- delete_action, delete_eligibility_time, delete_execution_time
- retention_exception_reason, retention_hold_status
- lifecycle_workload, lifecycle_location

Retention event discovery SPL (run before building parsers):
index=o365 sourcetype="o365:management:activity" (Operation="*Retention*" OR Operation="*Disposition*" OR Operation="*Record*" OR Operation="*Label*" OR Operation="RelabelItem" OR Operation="ExtendRetention") | stats count by Operation Workload RecordType | sort -count

Additional required domains: Insider Risk Management
Ingest and normalize Microsoft Purview Insider Risk Management audit records.
Management Activity API record types:
- 306 InsiderRiskManagement (individual risk activity events: exfiltration signals, sequence triggers, policy matches)
- 307 InsiderRiskManagementAlert (alert-level records: AlertId, PolicyName, Severity, AlertStatus)
- 308 InsiderRiskManagementCase (case-level records: CaseId, CaseName, CaseStatus, Severity)

Normalize into purview_insider_risk_fact:
- irm_case_id, irm_case_name, irm_alert_id
- irm_policy_id, irm_policy_name, irm_policy_template
- irm_risk_score, irm_risk_level
- irm_activity_type, irm_activity_time, irm_triggering_event
- irm_user_upn, irm_user_department, irm_user_role (privacy-controlled)
- irm_scoped_user_status
- irm_case_status, irm_alert_status
- irm_assigned_to, irm_review_status
- irm_resolution, irm_false_positive
- irm_notes_present, irm_privacy_redaction_state
- irm_escalated_to_investigation
- irm_created_time, irm_updated_time

IRM privacy requirement: Apply Splunk RBAC before enabling user-level IRM fields in any dashboard. Executive and engineering dashboards show aggregate counts, trends, risk bands, and status only. User-identifiable IRM details are restricted to approved investigation roles.

Updated purview_control_fact unified model must include:
- DLP detection
- Sensitivity labeling
- Retention and lifecycle activity
- Records management activity
- Disposition review
- Insider Risk alerts and cases
- Defender incidents and alerts
- Workflow state and ticket linkage
- KPI maturity tracking

Acceptance criteria:
- Splunk receives UAL/O365 DLP data.
- Splunk receives Defender XDR incidents and alerts.
- Retention/lifecycle audit events are discoverable in Splunk via idx_ms_o365_retention.
- Retention label and disposition activities are parsed where present in UAL.
- Retention & Lifecycle dashboard (Dashboard 6) exists.
- Insider Risk record types 306, 307, and 308 are searched and validated via idx_ms_purview_insider_risk.
- Insider Risk Management dashboard (Dashboard 7) exists.
- Insider Risk reporting is privacy-safe by default; RBAC applied.
- Retention and Insider Risk KPIs are tagged Blank, Partial, or Live.
- At least one dashboard exists for each audience: Engineering, Investigations, Executive, Audit, Records Management.
- Every KPI is tagged Blank, Partial, or Live.
- Every KPI has a source system, owner, refresh cadence, and known limitation.
- Dashboards distinguish Health from Effectiveness.
- Executives can view risk posture in under five minutes.
- Auditors can trace each report back to source telemetry and control objective.
- Office 365 Management Activity API reference — Microsoft Learn
- Splunk Add-on for Microsoft Office 365 — splunk.github.io
- Integrate SIEM tools with Microsoft Defender XDR — Microsoft Learn
- Get started with DLP alerts (Defender XDR as preferred portal) — Microsoft Learn
- Microsoft Defender XDR Advanced Hunting API — Microsoft Learn
- DataSecurityEvents table in Advanced Hunting (Preview) — Microsoft Learn
- Microsoft Purview Data Lifecycle Management — Microsoft Security
- Office 365 Management Activity API schema (incl. IRM record types 306–308) — Microsoft Learn
- Audit log activities (retention, disposition, records management) — Microsoft Learn