Using threads and other parallel execution techniques efficiently to achieve concurrency on multiple processors or cores is becoming more difficult as the complexity of engineering applications increases. While hardware performance and scalability in this environment have been well studied, the software and operating system aspects of parallel code execution deserve additional attention, especially on smaller multi-core architectures such as those found in desktop computers. To address issues associated with mixing a multi-threaded load with available Linux benchmarking tools, a matrix-multiply application was customized to generate the multi-threaded load for testing. In this study, the application was executed together with the UNIXBENCH benchmark test suite in experiments designed to reveal problem areas that should be considered when implementing applications on modern parallel computing architectures. The analysis covers five types of operations: CPU-intensive computation, inter-process communication with pipes, shell script execution, file I/O, and system call overhead. The results indicate that shell script execution, file I/O, and system call overhead suffered the greatest performance degradation as the multi-threaded load increased, whereas pipe-based communication (directly between processes) and CPU-intensive operations tended to scale well.
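As an illustration of the kind of multi-threaded matrix-multiply load described above, the following minimal sketch splits the rows of the result matrix into contiguous bands and assigns one band to each worker thread. It is not the application used in the study: the function names `multiply_rows` and `threaded_matmul` and the row-band partitioning are assumptions made for this example. Note also that CPython's global interpreter lock serializes pure-Python arithmetic, so the sketch shows the structure of such a load rather than a true multi-core stress generator.

```python
import threading


def multiply_rows(a, b, c, row_start, row_end):
    """Compute rows [row_start, row_end) of c = a * b in place.

    Hypothetical helper for this sketch; each thread writes only
    to its own rows of c, so no locking is needed.
    """
    inner = len(b)
    cols = len(b[0])
    for i in range(row_start, row_end):
        for j in range(cols):
            total = 0
            for k in range(inner):
                total += a[i][k] * b[k][j]
            c[i][j] = total


def threaded_matmul(a, b, num_threads):
    """Multiply matrices a and b using up to num_threads worker threads.

    num_threads is assumed to be >= 1. Rows are divided into
    contiguous bands of (nearly) equal size, one band per thread.
    """
    rows = len(a)
    cols = len(b[0])
    c = [[0] * cols for _ in range(rows)]

    # Ceiling division so that every row is covered by some band.
    band = (rows + num_threads - 1) // num_threads

    threads = []
    for t in range(num_threads):
        start = t * band
        end = min(start + band, rows)
        if start >= end:
            break  # more threads requested than rows available
        worker = threading.Thread(
            target=multiply_rows, args=(a, b, c, start, end)
        )
        threads.append(worker)
        worker.start()

    # Wait for all bands to finish before returning the result.
    for worker in threads:
        worker.join()
    return c
```

Varying `num_threads` (and the matrix size) is one way such a program could scale the background load while a benchmark suite runs concurrently; a load generator intended for real measurements would more plausibly be written in a language with native threads.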