Auto-scaling is one of the most important features of cloud computing. It promises cloud customers the ability to adapt the capacity of their systems to the load they face while maintaining the Quality of Service (QoS). The adaptation is performed automatically by increasing or decreasing the amount of provisioned resources to match the workload's resource demands. Two types and several techniques of auto-scaling have been proposed in the literature. However, regardless of the type or technique used, over-provisioning or under-provisioning is often observed. In this paper, we model the auto-scaling mechanism with Stochastic Well-formed coloured Nets (SWN). Simulating the SWN model allows us to find the system state (the number of requests waiting to be dispatched and the idle times of the started resources) at which the auto-scaling mechanism must be triggered in order to minimize the amount of resources used without violating the service-level agreement (SLA).
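To make the notion of a triggering state concrete, the following minimal Python sketch shows a threshold rule over the two state variables named above: the number of requests waiting to be dispatched and the idle times of the started resources. It is an illustration only and not the SWN model. The names `SystemState` and `scaling_decision`, and the threshold values `max_pending`, `max_idle`, and `min_resources`, are hypothetical placeholders; in the approach described here, such thresholds would be obtained from simulation of the SWN model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemState:
    """Hypothetical snapshot of the system (names are assumptions)."""
    pending_requests: int     # requests waiting to be dispatched
    idle_times: List[float]   # idle time, in seconds, of each started resource


def scaling_decision(state: SystemState,
                     max_pending: int = 10,      # assumed threshold
                     max_idle: float = 60.0,     # assumed threshold (seconds)
                     min_resources: int = 1) -> int:
    """Return +1 to start a resource, -1 to stop one, 0 to do nothing."""
    # Too many waiting requests: risk of under-provisioning, so scale out.
    if state.pending_requests > max_pending:
        return 1
    # A resource has been idle too long: risk of over-provisioning, so scale in,
    # but never drop below the minimum number of started resources.
    if (len(state.idle_times) > min_resources
            and any(t > max_idle for t in state.idle_times)):
        return -1
    return 0
```

For example, under these assumed thresholds, a state with 25 pending requests would trigger a scale-out, while a state with no pending requests and one of two resources idle for 120 seconds would trigger a scale-in.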