Meara (2005) developed the LLAMA tests as a free, language-neutral, user-friendly suite of aptitude tests incorporating four
separate elements: vocabulary learning (LLAMA_B), phonetic (implicit) memory (LLAMA_D), sound-symbol correspondence (LLAMA_E) and
grammatical inferencing (LLAMA_F), based on the standardised MLAT tests (Carroll & Sapon 1959). The LLAMA tests have recently become
increasingly popular in L2 acquisition research (Grañena & Long 2013b). However, Meara has expressed concern about the wide
use of these tests without validity testing (cf. Grañena 2013a). To address this concern, we investigated six issues relating to the LLAMA
tests: (1) the role of gender in LLAMA test performance; (2) language neutrality; (3) the role of age; (4) the role of formal
education qualifications; (5) the effect of playing logic puzzles on LLAMA scores; and (6) the effect of changing the test timings
on scores. We tested 229 participants aged 10–75 from a range of language backgrounds, with various education levels, typologically distinct
L1s, and varying levels of multilingualism. A subset of participants was also tested with varying timings for the
tests. The results showed that the LLAMA tests are gender- and language-neutral. The younger learners (aged 10–11) performed
significantly worse than the adults in the sound-symbol correspondence task (LLAMA_E). Formal education qualifications were associated with a
significant advantage in three of the LLAMA subcomponents (B, E, F) but not in the implicit measure (LLAMA_D). Playing logic puzzles did
not improve LLAMA test scores. The timings appear to be optimal except for LLAMA_F, whose timing could be shortened. We suggest that the
LLAMA aptitude tests are not significantly affected by these factors, although researchers using these tests should be aware of the
possible impact of education level on some components of the tests.