Effective task scheduling is crucial for achieving high performance in heterogeneous computing environments. When scheduling Out-Tree task graphs, many previous heterogeneity-based heuristic algorithms incur high scheduling costs and may not deliver good-quality schedules at lower cost. Based on the characteristics of Out-Tree task graphs and the features of heterogeneous environments, and adopting a strategy based on expected costs and task duplication, this paper proposes a greedy scheduling algorithm. At each scheduling step, the algorithm tries to avoid increasing the schedule length by scheduling the current task onto the already-used processor that minimizes its finish time; meanwhile, it takes load balance into account to economize on the use of processors. Comparative experimental results show that the proposed algorithm has higher scheduling efficiency and robust performance, producing better schedules with shorter schedule lengths and fewer used processors.
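As an illustration only, the greedy step described above can be sketched in Python. This is a simplified sketch under stated assumptions, not the proposed algorithm itself: it omits task duplication and the expected-cost ordering, and the function name `greedy_out_tree_schedule`, its parameters, and the rule used to decide when a new processor is opened are all hypothetical choices made for this example.

```python
def greedy_out_tree_schedule(parent, cost, comm, order):
    """Simplified greedy placement for an Out-Tree (assumed interface).

    parent: task -> parent task, or None for the root
    cost:   task -> list of execution times, one entry per processor
    comm:   task -> delay for receiving data from its parent when the
            two tasks run on different processors
    order:  tasks listed so that every parent precedes its children
    """
    n_procs = len(next(iter(cost.values())))
    ready = [0.0] * n_procs   # time at which each processor becomes free
    used = []                 # processors opened so far
    placed = {}               # task -> (processor, finish time)
    makespan = 0.0

    def finish_on(task, p):
        par = parent[task]
        if par is None:
            data_ready = 0.0
        else:
            par_proc, par_finish = placed[par]
            # No communication delay when parent and child share a processor.
            data_ready = par_finish if par_proc == p else par_finish + comm[task]
        return max(ready[p], data_ready) + cost[task][p]

    for task in order:
        unused = [p for p in range(n_procs) if p not in used]
        best_used = min(used, key=lambda p: finish_on(task, p), default=None)
        best_new = min(unused, key=lambda p: finish_on(task, p), default=None)

        # Assumed rule: stay on an already-used processor unless doing so
        # would lengthen the schedule and an unused processor finishes sooner.
        choice = best_used
        if best_used is None:
            choice = best_new
        elif (best_new is not None
              and finish_on(task, best_used) > makespan
              and finish_on(task, best_new) < finish_on(task, best_used)):
            choice = best_new

        finish = finish_on(task, choice)
        if choice not in used:
            used.append(choice)
        ready[choice] = finish
        placed[task] = (choice, finish)
        makespan = max(makespan, finish)

    return placed, makespan, used


# Hypothetical three-task Out-Tree: A is the root, B and C are its children.
parent = {"A": None, "B": "A", "C": "A"}
cost = {"A": [2, 3], "B": [3, 3], "C": [3, 3]}
comm = {"A": 0, "B": 1, "C": 1}
placed, makespan, used = greedy_out_tree_schedule(parent, cost, comm, ["A", "B", "C"])
print(makespan, used)  # 6.0 [0, 1]
```

In this toy instance the second processor is opened only for task C, where staying on the first processor would finish at time 8 instead of 6. With a larger communication delay for C, the sketch keeps all three tasks on one processor, which reflects the trade-off between schedule length and the number of used processors.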