A SIEM (Security Information and Event Management) platform is the log backbone of a SOC. It takes logs from every source in the environment, normalizes them into a common shape, and makes them searchable and correlatable. Without a SIEM, analysts are tailing twelve different consoles. With a well-operated SIEM, they are running one query and seeing the whole picture. This module covers what goes into a SIEM, how logs get parsed and normalized, and the pragmatic choices Indian teams make when selecting and running one.
What a SIEM actually does
Strip the marketing and a SIEM has four jobs:
- Ingest: accept logs from heterogeneous sources (endpoint, network, cloud, application, identity)
- Parse & normalize: turn raw log lines into structured events with consistent field names
- Store & search: keep the events queryable for days to years depending on retention policy
- Detect: run correlation rules continuously to fire alerts when conditions are met
Everything else, including dashboards, reports, user and entity behaviour analytics (UEBA), and SOAR integrations, is layered on top.
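To make the "parse & normalize" and "detect" jobs concrete, here is a minimal sketch in Python. It assumes an OpenSSH-style "Failed password" log line and a made-up event schema; the field names, the function names `normalize_sshd_failure` and `detect_failed_login_burst`, and the threshold of 5 failures in 5 minutes are all illustrative assumptions, not any particular SIEM's schema or rule language.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta, timezone
from typing import Optional

# Matches OpenSSH-style failed password messages (with or without "invalid user").
SSHD_FAILED = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<src_ip>\S+) port (?P<src_port>\d+)"
)


def normalize_sshd_failure(raw: str, host: str, received_at: datetime) -> Optional[dict]:
    """Turn one raw sshd line into a structured event, or None if it does not match."""
    match = SSHD_FAILED.search(raw)
    if match is None:
        return None
    return {
        # Hypothetical common field names; a real deployment would follow its SIEM's schema.
        "timestamp": received_at.astimezone(timezone.utc),
        "host": host,
        "source": "sshd",
        "event_type": "authentication",
        "outcome": "failure",
        "user": match.group("user"),
        "src_ip": match.group("src_ip"),
        "src_port": int(match.group("src_port")),
    }


def detect_failed_login_burst(events, threshold=5, window=timedelta(minutes=5)):
    """Toy correlation rule: alert when one source IP reaches `threshold`
    authentication failures inside a sliding `window`."""
    recent = defaultdict(deque)  # src_ip -> timestamps of recent failures
    alerts = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["event_type"] != "authentication" or event["outcome"] != "failure":
            continue
        seen = recent[event["src_ip"]]
        seen.append(event["timestamp"])
        # Drop failures that have aged out of the window.
        while seen and event["timestamp"] - seen[0] > window:
            seen.popleft()
        if len(seen) >= threshold:
            alerts.append({
                "rule": "failed_login_burst",
                "src_ip": event["src_ip"],
                "count": len(seen),
                "last_seen": event["timestamp"],
            })
            seen.clear()  # reset so the same burst does not alert repeatedly
    return alerts
```

Because the rule reads only normalized fields (`event_type`, `outcome`, `src_ip`), the same logic would work for VPN or RDP failures once those sources are mapped to the same field names. That is the practical payoff of normalization.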
Log sources that actually matter
You cannot ingest everything. Storage and licensing kill you if you try. Prioritize by investigative value:
- Identity: Azure AD / Okta / Google Workspace sign-in logs. High signal-to-noise for account compromise, impossible travel, MFA fatigue, privilege changes
- EDR: process creation, network connections, file writes on endpoints. The richest telemetry you will ingest, and most SIEMs charge accordingly
- Cloud platform: AWS CloudTrail, Azure Activity Log, GCP Audit Logs. Essential for any cloud-native company
- Network: firewall deny logs, proxy logs, DNS query logs, Zeek/Suricata if you run IDS
- Authentication: VPN, SSH, RDP session starts and failures
- Application: web server access logs (filtered), application auth events, security-relevant actions like role changes, data exports
- Email gateway: message metadata, verdicts, URL clicks, attachment detonations
Things to deprioritize: debug logs, INFO-level application noise, most Windows event IDs outside the well-known security subset, and NetFlow at full per-flow detail. You can always add a source later; you cannot un-pay for what you have already ingested.
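The deprioritization advice above can be expressed as a simple ingest filter. The sketch below is illustrative only: the event shape (`source`, `event_id`, `level`, `security_relevant`) is a hypothetical schema, and the Windows Security event IDs are a small, commonly cited subset rather than a complete recommended list.

```python
# Illustrative subset of Windows Security event IDs; not exhaustive.
SECURITY_EVENT_IDS = {
    1102,  # audit log cleared
    4624,  # successful logon
    4625,  # failed logon
    4672,  # special privileges assigned to a new logon
    4688,  # process creation
    4720,  # user account created
    4732,  # member added to a security-enabled local group
    4740,  # account locked out
}

NOISY_LEVELS = {"DEBUG", "INFO"}


def should_ingest(event: dict) -> bool:
    """Return True if the event is worth paying to ingest (hypothetical policy)."""
    source = event.get("source")
    if source == "windows_security":
        # Keep only the allowlisted security subset.
        return event.get("event_id") in SECURITY_EVENT_IDS
    if source == "application":
        # Security-relevant actions (role changes, data exports) survive at any level.
        if event.get("security_relevant"):
            return True
        return str(event.get("level", "")).upper() not in NOISY_LEVELS
    # Other sources pass through; add per-source rules as needed.
    return True
```

An allowlist for Windows events and a denylist for application log levels is a deliberate asymmetry in this sketch: Windows produces far more event IDs than anyone triages, while application logs are usually easier to cut by level.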
The log pipeline
SOURCE ──► COLLECTOR ──► PIPELINE    ──► SIEM       ──► STORAGE
(host)     (agent/       (parse,         (index,        (hot/warm/
           syslog/       enrich,         search,        cold tiers)
           API pull)     drop noise,     correlate)
                         route)
A common mistake is to ingest raw logs directly into the SIEM with no pipeline in between. The pipeline (Cribl, Logstash, Vector, Fluent Bit) is where you parse, enrich, drop noise, and route events before they reach the SIEM.
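As a rough illustration of what the pipeline stage does, the following Python sketch chains the four steps named in the diagram: parse, enrich, drop noise, route. It is not how Cribl, Logstash, Vector, or Fluent Bit are configured; the `key=value` log format, the `ASSET_OWNERS` lookup, and the routing rule (denies go to the SIEM, everything else to a cheaper archive) are assumptions made for the example.

```python
from typing import Callable, Iterable, Iterator, Optional

Event = dict
Stage = Callable[[Event], Optional[Event]]  # a stage returns None to drop the event

# Hypothetical asset inventory used for enrichment.
ASSET_OWNERS = {"web-01": "payments-team"}


def parse(event: Event) -> Event:
    """Split a raw 'key=value key=value' line into fields."""
    fields = dict(tok.split("=", 1) for tok in event["raw"].split() if "=" in tok)
    return {**event, **fields}


def enrich(event: Event) -> Event:
    """Attach context the source host does not know about itself."""
    return {**event, "owner": ASSET_OWNERS.get(event.get("host"), "unknown")}


def drop_noise(event: Event) -> Optional[Event]:
    """Discard debug-level events before they cost storage or licence."""
    if event.get("level", "").lower() == "debug":
        return None
    return event


def route(event: Event) -> Event:
    """Tag a destination: assumed rule that denies go to the SIEM, the rest to archive."""
    destination = "siem" if event.get("action") == "deny" else "archive"
    return {**event, "destination": destination}


def run_pipeline(events: Iterable[Event], stages: Iterable[Stage]) -> Iterator[Event]:
    """Apply each stage in order; stop processing an event as soon as a stage drops it."""
    stages = list(stages)
    for event in events:
        for stage in stages:
            event = stage(event)
            if event is None:
                break
        else:
            yield event
```

Running `run_pipeline(raw_events, [parse, enrich, drop_noise, route])` over a list of `{"raw": "..."}` dictionaries yields only the surviving events, each tagged with an `owner` and a `destination`. Keeping every stage as a small function that returns either an event or `None` makes it easy to reorder stages or add a new drop rule without touching the rest.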