Course Objectives
The course will lay down the basic concepts and techniques of linear algebra and calculus needed for subsequent study.
The course will explore the concepts initially through computational experiments and then try to understand the concepts and theory behind them.
The course will provide an appreciation of the wide application of these disciplines within the scientific world.
Syllabus
UNIT I
Matrices and Gaussian Elimination – Introduction, geometry of linear equations, Gaussian elimination, matrix multiplication, inverses and transposes. Vector spaces and Linear equations – Vector spaces and subspaces, linear independence, basis and dimension, four fundamental subspaces.
UNIT II
Orthogonality – Perpendicular vectors and orthogonal subspaces, inner products and projections onto lines, projections and least squares applications, orthogonal basis, orthogonal spaces, orthogonal matrices, Gram-Schmidt orthogonalization, FFT.
UNIT III
Probability, compound probability and discrete random variables. Binomial, Normal and Poisson distributions, Sampling distribution, elementary concepts of estimation and theory of hypothesis testing, recurrence relations.
UNIT IV
Eigenvalues and Eigenvectors – Introduction, diagonal form of a matrix, difference equations and the powers A^k, Positive Definite Matrices – Minima, maxima and saddle points, tests for positive definiteness, semi-definite and indefinite matrices, Singular Value Decomposition, Iterative methods for Ax = b.
UNIT V
Introduction to special matrices – Fourier transforms: discrete and continuous, shift matrices and circulant matrices, Kronecker product, sine and cosine transforms from Kronecker sums, Toeplitz matrices and shift-invariant filters, graphs and Laplacians and Kirchhoff's laws, clustering by spectral methods and k-means, completing rank-one matrices, orthogonal Procrustes problem, distance matrices.
Textbook/References
Gilbert Strang, Linear Algebra and its Applications, Fourth Edition, Cambridge University Press, 2009.
Gene H. Golub and Charles F. Van Loan, Matrix Computations, Third Edition, Johns Hopkins University Press, Baltimore, 1996.
David C. Lay, Linear Algebra and Its Applications, Pearson Addison Wesley, 2002.
Gilbert Strang, Linear Algebra and Learning from Data, Wellesley-Cambridge Press, 2019.
Syllabus
UNIT I
INTRODUCTION: Basic concepts of OOPs – Templates – Algorithm Analysis – ADT - List (Singly, Doubly and Circular) Implementation - Array, Pointer, Cursor Implementation
UNIT II
BASIC DATA STRUCTURES: Stacks and Queues – ADT, Implementation and Applications - Trees – General, Binary, Binary Search, Expression Search, AVL, Splay, B-Trees – Implementations - Tree Traversals.
UNIT III
ADVANCED DATA STRUCTURES: Set – Implementation – Basic operations on set – Priority Queue – Implementation - Graphs – Directed Graphs – Shortest Path Problem - Undirected Graph - Spanning Trees – Graph Traversals
UNIT IV
MEMORY MANAGEMENT: Issues - Managing Equal Sized Blocks - Garbage Collection Algorithms for Equal Sized Blocks - Storage Allocation for Objects with Mixed Sizes - Buddy Systems - Storage Compaction
UNIT V
SEARCHING, SORTING AND DESIGN TECHNIQUES: Searching Techniques, Sorting – Internal Sorting – Bubble Sort, Insertion Sort, Quick Sort, Heap Sort, Bin Sort, Radix Sort – External Sorting – Merge Sort, Multi-way Merge Sort, Polyphase Sorting - Design Techniques - Divide and Conquer - Dynamic Programming - Greedy Algorithm – Backtracking - Local Search Algorithms
Reference Books:
Mark Allen Weiss, “Data Structures and Algorithm Analysis in C++”, Pearson P
Aho, Hopcroft, Ullman, “Data Structures and Algorithms”, Pearson Education P
Drozdek, Data Structures and Algorithms in Java, Cengage (Thomson)
Gilberg, Data Structures Using C++, Cengage
Horowitz, Sahni, Rajasekaran, “Computer Algorithms”, Galgotia
Tenenbaum A.M., Langsam Y., Augenstein M.J., “Data Structures using C & C++”, Prentice Hall of India, 2002
Course Outcomes:
After completing the course, students should be able to:
Describe in depth the theories, methods, and algorithms of machine learning.
Find and analyze the optimal hyperparameters of machine learning algorithms.
Examine the nature of a problem at hand and determine whether machine learning can solve it efficiently enough.
Solve real-world problems using machine learning and implement the solutions.
Introduction to machine learning (ML): Basics of ML, History of ML, Evolution of ML, ML Models, Learning and testing models, ML Algorithms and Convergence, ML Techniques, Types of ML, supervised and unsupervised learning, classification and clustering, Applications of ML, Bias-Variance tradeoff.
Neural Networks: McCulloch-Pitts neuron models, Activation Functions, Loss Functions, perceptron, Gradient Descent, Multilayer neural networks: back-propagation, back-propagation calculus, Initialization, Training rules, issues in back-propagation, Bayesian Learning, Competitive learning and self-organizing maps.
Support Vector Machines (SVM): SVM Formulation, Interpretation & Analysis, hard and soft margin, Hinge loss, SVM dual, SVM tuning parameters, SVM Kernels, twin SVM.
Clustering: K-Means Clustering, Mean Shift Clustering, Agglomerative Clustering, Association Rule Mining, Partition Clustering, Hierarchical Clustering, BIRCH Algorithm, CURE Algorithm, Density-based Clustering, Gaussian Mixture Models, and Expectation Maximization. Parameter estimation – MLE, MAP.
Learning Theory: Probably Approximately Correct (PAC) Model, PAC Learnability, Agnostic PAC Learning, Theoretical analysis of machine learning problems and algorithms, Generalization error bounds, VC Model, ML Tools.
Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
Leonard Kaufman and Peter J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, 2005.
Nello Cristianini and John Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, 2000.
Bernhard Schölkopf and Alexander J. Smola, Learning with Kernels, MIT Press, 2002.
Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.
Distribution properties and arithmetic Samples/CLT, Basic machine learning algorithms, Linear regression, SVM, Naive Bayes.
Cathy O’Neil and Rachel Schutt, Doing Data Science: Straight Talk from the Frontline, O’Reilly.
Jure Leskovec, Anand Rajaraman and Jeffrey Ullman, Mining of Massive Datasets, V2.1, Cambridge University Press.
UNIT-I Introduction to Big Data. What is Big Data. Why Big Data is Important. Meet Hadoop. Data. Data Storage and Analysis. Comparison with other systems. Grid Computing. A brief history of Hadoop. Apache Hadoop and the Hadoop Ecosystem. Linux refresher; VMware Installation of Hadoop.
UNIT-II The design of HDFS. HDFS concepts. Command line interface to HDFS. Hadoop File systems. Interfaces. Java Interface to Hadoop. Anatomy of a file read. Anatomy of a file write. Replica placement and Coherency Model. Parallel copying with distcp, Keeping an HDFS cluster balanced.
UNIT-III Introduction. Analyzing data with Unix tools. Analyzing data with Hadoop. Java MapReduce classes (new API). Data flow, combiner functions, Running a distributed MapReduce Job. Configuration API. Setting up the development environment. Managing configuration. Writing a unit test with MRUnit. Running a job in local job runner. Running on a cluster. Launching a job. The MapReduce Web UI.
UNIT-IV Classic MapReduce. Job submission. Job Initialization. Task Assignment. Task execution. Progress and status updates. Job Completion. Shuffle and sort on Map and Reduce side. Configuration tuning. MapReduce Types. Input formats. Output formats. Sorting. Map side and Reduce side joins.
UNIT-V The Hive Shell. Hive services. Hive clients. The metastore. Comparison with traditional databases. HiveQL. HBasics. Concepts. Implementation. Java and MapReduce clients. Loading data, web queries.
TEXT BOOKS:
Tom White, “Hadoop: The Definitive Guide”, 3rd Edition, O’Reilly Publications, 2012.
Dirk deRoos, Chris Eaton, George Lapis, Paul Zikopoulos, Tom Deutsch, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw-Hill Osborne Media, 1st Edition, 2011.
Relevance Scoring and ranking for Web - Similarity - Hadoop & MapReduce - Evaluation - Personalized search - Collaborative filtering and content-based recommendation of documents and products - Handling invisible Web - Snippet generation, Summarization, Question Answering, Cross-Lingual Retrieval.
C. Manning, P. Raghavan and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, 2nd Edition, ACM Press Books, 2011.
Bruce Croft, Donald Metzler and Trevor Strohman, Search Engines: Information Retrieval in Practice, 1st Edition, Addison Wesley, 2009.
Mark Levene, An Introduction to Search Engines and Web Navigation, 2nd Edition, Wiley, 2010.
Data Mining Techniques; Arun K. Pujari; University Press.
Data Mining; Adriaans & Zantinge; Pearson Education.
Mastering Data Mining; Berry & Linoff; Wiley.
Data Mining; Dunham; Pearson Education.
Text Mining Applications; Konchady; Cengage.