Experiments on mechanical deformable vocal fold replicas are important in physical studies of human voice production for understanding the underlying fluid–structure interaction. To date, most experiments have been performed under constant initial conditions with respect to both structural and geometrical features. Varying those conditions requires manual intervention, which can affect reproducibility and hence the quality of the experimental results. In this work, a setup is described that allows the elastic and geometrical initial conditions of a deformable vocal fold replica to be set in an automated way. High-speed imaging is integrated in the setup in order to decorrelate elastic and geometrical features. In this way, reproducible, accurate and systematic measurements can be performed for prescribed initial conditions of glottal area, mean upstream pressure and vocal fold elasticity. Moreover, quantification of geometrical features during auto-oscillation is shown to contribute to the experimental characterization and understanding.