1. Data Mining for Security Applications

2.
  • Overview of Data Mining
  • Security Threats
  • Data Mining for Cyber Security Applications
  • Intrusion Detection
  • Data Mining for Firewall Policy Management
  • Data Mining for Worm Detection
  • Data Mining for Counter-terrorism
  • Surveillance
  • Advantages
  • Conclusion

3. Data Mining
  • Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) information or patterns from data in large databases [Han and Kamber 2005].
  • Data mining is used to sort through the tremendous amounts of data stored by automated data collection tools.
  • It extracts rules, regularities, patterns, and constraints from databases.

4. Threat Types
  • Natural disasters
  • Human errors
  • Non-information-related threats
  • Information-related threats
  • Biological, chemical, and nuclear threats
  • Critical infrastructure threats

5. Data mining is being applied to problems such as intrusion detection and auditing. For example:
  • Anomaly detection techniques could be used to detect unusual patterns and behaviors.
  • Link analysis may be used to trace self-propagating malicious code to its authors.
  • Classification may be used to group various cyber attacks and then use the resulting profiles to detect an attack when it occurs.
  • Prediction may be used to determine potential future attacks, based in part on information learned about terrorists through email and phone conversations.

6. An intrusion can be defined as “any set of actions that attempt to compromise the integrity, confidentiality, or availability of a resource”.
  • Attacks are:
      • Host-based attacks
      • Network-based attacks
  • Intrusion detection systems are split into two groups:
      • Anomaly detection systems
      • Misuse detection systems

7.
  • Data mining can help automate the process of investigating intrusion detection alarms.
  • Data mining on historical audit data and intrusion detection alarms can reduce future false alarms.

8.
  • Anomaly detection
      • Build models of normal data
      • Detect any deviation from normal data
      • Flag deviation as suspect
      • Identify new types of intrusions as deviations from normal behavior
  • Misuse detection
      • Label all instances in the data set (“normal” or “intrusion”)
      • Run learning algorithms over the labeled data to generate classification rules
      • Automatically retrain intrusion detection models on different input data

9. Misuse detection: Classification Model
  • Bayesian classifier
  • Decision tree
  • Association rule
  • Support vector machine
  • Learning from rare class

10. Anomaly detection: Anomaly Detection Model
  • Association rule
  • Neural network
  • Unsupervised SVM
  • Outlier detection

11. Analysis of Firewall Policy Rules Using Data Mining Techniques
  • The firewall is the de facto core technology of today’s network security
  • It is the first line of defense against external network attacks and threats
  • A firewall controls or governs network access by allowing or denying incoming or outgoing network traffic according to firewall policy rules
  • Manual definition of rules often results in anomalies in the policy
  • Detecting and resolving these anomalies manually is a tedious and error-prone task

12.
  • Anomaly detection: theoretical framework for the resolution of anomalies. A new algorithm will simultaneously detect and resolve any anomaly that is present in the policy rules.
  • Traffic mining: mine the traffic and detect anomalies.

13. To bridge the gap between what is written in the firewall policy rules and what is observed in the network, analyze the traffic and the packet logs. Network traffic trends may show that some rules are outdated or have not been used recently.
  Diagram labels:
  • Firewall Policy Rule
  • Firewall Log File
  • Mining Log File Using Frequency
  • Filtering Rule Generalization
  • Generic Rules
  • Identify Decaying & Dominant Rules
  • Edit Firewall Rules

14. What are worms?
  • Self-replicating program; exploits a software vulnerability on a victim; remotely infects other victims
  • Goals of worm detection: real-time detection
  • Issues: substantial volume of identical traffic, random probing
  • Methods for worm detection: count the number of sources/destinations; count the number of failed connection attempts
  • Worm types: email worms, instant messaging worms, Internet worms, IRC worms, file-sharing network worms

15. Task: given some training instances of both “normal” and “viral” emails, induce a hypothesis to detect “viral” emails.
  Diagram labels:
  • Training data
  • Feature extraction
  • Clean or Infected?
  • Outgoing Emails
  • Classifier
  • Machine Learning
  • Test data
  • The Model

16.

17.
  • Gather data from multiple sources
      • Information on terrorist attacks: who, what, where, when, how
      • Personal and business data: place of birth, ethnic origin, religion, education, work history, finances, criminal record, relatives, friends and associates, travel history, ...
      • Unstructured data: newspaper articles, video clips, speeches, emails, phone records, ...
  • Integrate the data, build warehouses and federations
  • Develop profiles of terrorists, activities/threats
  • Mine the data to extract patterns of potential terrorists and predict future activities and targets
  • Find the “needle in the haystack” (suspicious needles?)
  • Data integrity is important

18. Diagram labels:
  • Data sources with information about terrorists and terrorist activities
  • Integrate data sources
  • Clean/modify data sources
  • Build profiles of terrorists and activities
  • Mine the data
  • Examine results/prune results
  • Report final results

19.
  • Nature of data
      • Data arriving from sensors and other devices
      • Continuous data streams
      • Breaking news, video releases, satellite images
      • Some critical data may also reside in caches
  • Rapidly sift through the data and set aside unneeded data for later use and analysis (non-real-time data mining)
  • Data mining techniques need to meet timing constraints
  • Quality of service (QoS) tradeoffs among timeliness, precision, and accuracy
  • Presentation of results, visualization, real-time alerts and triggers

20. Diagram labels:
  • Data sources with information about terrorists and terrorist activities
  • Integrate data sources in real time
  • Rapidly sift through data and discard irrelevant data
  • Build real-time models
  • Mine the data
  • Examine results in real time
  • Report final results

21.

22.
  • Huge amounts of surveillance and video data are available in the security domain
  • Analysis is usually done off-line using “human eyes”
  • Tools are needed to aid the human analyst (pointing out areas in video where unusual activity occurs)

23.
  • Event representation: estimate the distribution of pixel intensity change
  • Event comparison: contrast the event representations of different video sequences to determine whether they contain similar semantic event content
  • Event detection: use manually labeled training video sequences to classify unlabeled video sequences

24.
  • Law enforcement: data mining can help law enforcers identify and apprehend criminal suspects by examining trends in location, crime type, habit, and other patterns of behavior.
  • Researchers: data mining can speed up researchers' data analysis, leaving them more time to work on other projects.

25.
  • Various data mining techniques have been proposed to enhance the security of different applications.
  • Data mining has been shown to aid intrusion detection, firewall policy management, worm detection, and counter-terrorism, and the various techniques have been applied and evaluated in each of these areas.

26.
  • B. Thuraisingham. Managing threats to web databases and cyber systems: Issues, solutions and challenges. In V. Kumar et al., editors, Cyber Security: Threats and Countermeasures. Kluwer.
  • B. Thuraisingham. Data mining, national security, privacy and civil liberties. SIGKDD Explorations, January 2003.
  • F. Bolz et al. The Counterterrorism Handbook: Tactics, Procedures, and Techniques. CRC Press, 2001.
  • http://dmoz.org/Computers/Security/Intrusion_Detection_Systems/

27. Thank you
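
The anomaly detection steps on slide 8 (build a model of normal data, detect deviations, flag them as suspect) can be illustrated with a minimal sketch. It assumes numeric audit records and a simple per-feature z-score test; the function names, the sample records, and the threshold of 3 standard deviations are illustrative assumptions, not something the slides specify.

```python
from statistics import mean, stdev


def fit_normal_model(normal_records):
    """Build a per-feature model (mean, standard deviation) from normal data."""
    columns = list(zip(*normal_records))
    return [(mean(col), stdev(col)) for col in columns]


def is_suspect(record, model, threshold=3.0):
    """Flag a record if any feature lies more than `threshold` std devs from normal."""
    for value, (mu, sigma) in zip(record, model):
        if sigma == 0:
            # A constant feature in training: any different value is a deviation.
            if value != mu:
                return True
            continue
        if abs(value - mu) / sigma > threshold:
            return True
    return False


# Hypothetical audit records: (connections per minute, kilobytes sent).
normal = [(10, 200), (12, 210), (11, 190), (9, 205), (10, 195)]
model = fit_normal_model(normal)

observed = [(11, 200), (95, 4000)]
flags = [is_suspect(record, model) for record in observed]
print(flags)  # [False, True]
```

Because the model is built only from normal data, a record is flagged without any prior label for the attack, which is how this approach can surface new types of intrusions.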
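
The frequency-based log mining on slide 13 can be sketched in a similar spirit. The example below counts how often each policy rule appears in a firewall log and labels rules as dominant or decaying. The rule names, the log format (one matched rule name per packet), and both share thresholds are hypothetical; the slides name the categories but do not define them numerically.

```python
from collections import Counter


def classify_rules(policy_rules, log_rule_hits, dominant_share=0.5, decaying_share=0.05):
    """Label each policy rule by its share of the logged traffic.

    The thresholds are illustrative assumptions, not values from the slides.
    """
    counts = Counter(log_rule_hits)
    total = sum(counts[rule] for rule in policy_rules)
    labels = {}
    for rule in policy_rules:
        share = counts[rule] / total if total else 0.0
        if share >= dominant_share:
            labels[rule] = "dominant"
        elif share <= decaying_share:
            labels[rule] = "decaying"
        else:
            labels[rule] = "active"
    return labels


# Hypothetical policy and log: each log entry names the rule a packet matched.
policy_rules = ["allow-http", "allow-ssh", "allow-ftp"]
log_rule_hits = ["allow-http"] * 14 + ["allow-ssh"] * 6

labels = classify_rules(policy_rules, log_rule_hits)
print(labels)
```

In this sample, `allow-ftp` never appears in the log, so it is labeled decaying and becomes a candidate for review when the firewall rules are edited.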
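
The two worm detection methods listed on slide 14 (count the number of destinations per source, count the number of failed connection attempts) can be sketched as follows. The connection tuple format, the addresses, and both limits are assumptions made for illustration only.

```python
from collections import defaultdict


def find_suspect_sources(connections, max_destinations=3, max_failures=2):
    """Return sources that contact too many hosts or fail too many connections.

    `connections` is an iterable of (source, destination, succeeded) tuples.
    The limits are illustrative assumptions, not values from the slides.
    """
    destinations = defaultdict(set)
    failures = defaultdict(int)
    for src, dst, succeeded in connections:
        destinations[src].add(dst)
        if not succeeded:
            failures[src] += 1
    return sorted(
        src
        for src in destinations
        if len(destinations[src]) > max_destinations or failures[src] > max_failures
    )


# Hypothetical connection log.
connections = [
    ("10.0.0.5", "10.0.0.20", True),
    ("10.0.0.5", "10.0.0.21", True),
    ("10.0.0.9", "10.0.1.1", False),
    ("10.0.0.9", "10.0.1.2", False),
    ("10.0.0.9", "10.0.1.3", False),
    ("10.0.0.9", "10.0.1.4", True),
]

suspects = find_suspect_sources(connections)
print(suspects)  # ['10.0.0.9']
```

Random probing tends to produce many distinct destinations and many failed attempts from one source, which is why both counters are checked.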
