Spatiotemporal information processing in the human brain is carried out jointly by neurons and synapses acting on direct optical inputs. Mimicking this neural function with photonic devices could therefore be an essential step toward designing future artificial visual recognition and memory storage systems. Herein, we propose and develop a proof-of-principle two-terminal device that exhibits key features of both a neuron (integration, leakage, and relaxation) and a synapse (short- and long-term memory) in response to direct optical stimuli. Importantly, these devices, which combine processing and memory features, are further integrated into an artificial neural network capable of neuromorphic spatiotemporal image sensing. Our approach provides a simple but effective route to implementing an artificial visual recognition system, with further applications in edge computing and the Internet of Things.