Big Data Hugh volumes of data which cannot be handled using traditional database programming. Characterized by volume, variety, and velocity. Data Science A data-focused on scientific activity. Approaches to process big data. Harnesses the potential of big data for business decisions. Similar to data mining. Concept Big Data Diverse data types generated from multiple data…
As we’re moving towards the digital world — cybersecurity is getting a critical part of our life. When we talk about security in digital life also the main challenge is to find the abnormal activity. When we make any transaction while buying any product online — a good amount of people prefer credit cards. The…
K-Means Clustering is an Unsupervised Learning algorithm, which groups the unlabeled dataset into different clusters. It allows us to cluster the info into different groups and a convenient way to discover the categories of groups in the unlabeled dataset on its own without the need for any training. The k-means clustering algorithm mainly performs two…
Density-based special clustering of applications with noise or DBSCAN is a density-based clustering method that calculates how dense the neighborhood of a data point is. the main idea behind DBSCAN is that a point belongs to a cluster if it is close to many from that cluster. It will measure the similarity between data points,…
Support vector Machine or SVM is a Supervised Learning algorithm, which is used for Classification and Regression problems. However, primarily, it is used for classification problems in Machine Learning. The goal of the SVM algorithm is to create the decision boundary that can segregate n-dimensional space into classes so that we can easily classify new…
Breast cancer (BC) is one among the foremost common cancers among ladies worldwide, representing the bulk of recent cancer cases and cancer-related deaths in line with world statistics, creating it a major public ill health in today’s society. The early diagnosing of BC will improve the prognosis and probability of survival considerably, because it will promote timely clinical treatment to patients. any correct classification of benign tumors will stop patients undergoing supernumerary…
Random Forest is better than Decision Tree as the greater number of trees in the forest leads to higher accuracy and prevents the problem of overfitting. Algorithm working Random Forest uses a Bagging technique with one modification, where subset of features are used for finding best split. Advantages & Disadvantages Disadvantages Popular Posts
Project 1: Heart Disease Detection Machine Learning is used across numerous spheres around the world. The healthcare industry is no exception. Machine Learning can play an essential part in predicting presence/ absence of Locomotor diseases, Heart conditions and further. similar information, if predicted well in advance, can give important perceptivity to doctors who can also…
Analogy behind KNN: tell me about your friend (who your neighbors are) and I will tell you who you are. Algorithms Common distance function measure used for continuous variables. KNN Working The k-NN working can be explained on the basis of the below algorithm: Step-1: select the number of K of the neighbors Step-2: Calculate…
Tree based algorithms are a popular family of related non-parametric and supervised methods for both classification and regression. The decision tree looks like a upside-down tree with a decision rule at the root, from which subsequent decision rules spread out below. Sometimes decision trees are also referred to as CART, which is short for classification…
Machine learning relies on AI to predict the future based on past data. If you’re a data science enthusiast or practitioner, this article will help you build your own end-to-end machine learning project from scratch. In order for a machine-learning project to be successful, there are several steps that should be followed. These steps vary…
Logistic Regression is one of the most used Machine learning algorithms among industries and academia. It is a supervised learning algorithm used for classification where the target variable should be categorical. Why not Linear Regression for classification There are mainly two reasons for not fitting a linear regression on classification tasks: When we fit a…