In this paper we develop the foundation of a new theory for decision trees based on new modeling of phenomena with soft numbers. Soft numbers represent the theory of soft logic that addresses the need to combine real processes and cognitive ones in the same framework. At the same time soft logic develops a new concept of modeling and dealing with uncertainty: the uncertainty of time and space. It is a language that can talk in two reference frames, and also suggest a way to combine them. In the classical probability, in continuous random variables there is no distinguishing between the probability involving strict inequality and non-strict inequality. Moreover, a probability involves equality collapse to zero, without distinguishing among the values that we would like that the random variable will have for comparison. This work presents Soft Probability, by incorporating of Soft Numbers into probability theory. Soft Numbers are set of new numbers that are linear combinations of multiples of ”ones” and multiples of ”zeros”. In this work, we develop a probability involving equality as a ”soft zero” multiple of a probability density function (PDF). We also extend this notion of soft probabilities to the classical definitions of Complements, Unions, Intersections and Conditional probabilities, and also to the expectation, variance and entropy of a continuous random variable, condition being in a union of disjoint intervals and a discrete set of numbers. This extension provides information regarding to a continuous random variable being within discrete set of numbers, such that its probability does not collapse completely to zero. When we developed the notion of soft entropy, we found potentially another soft axis, multiples of 0log(0), that motivates to explore the properties of those new numbers and applications. We extend the notion of soft entropy into the definition of Cross Entropy and Kullback–Leibler-Divergence (KLD), and we found that a soft KLD is a soft number, that does not have a multiple of 0log(0). Based on a soft KLD, we defined a soft mutual information, that can be used as a splitting criteria in decision trees with data set of continuous random variables, consist of single samples and intervals.