A superpixel segment is a group of pixels that carry similar information. The Simple Linear Iterative Clustering (SLIC) is a well-known algorithm for generating superpixels that offers a good balance between accuracy and efficiency. Nevertheless, due to its high computational requirements, the algorithm does not meet the demands of real-time embedded applications in terms of speed and resources. This paper proposes a fully-pipelined FPGA architecture of SLIC, dubbed FP-SLIC, that exhibits 1) a simplified and efficient algorithm of reduced computational complexity that facilitates algorithm development for FPGAs, 2) a fully pipelined FPGA design operating at 40MHz with a throughput of one pixel per cycle, and 3) a memory-efficient architecture that eliminates the requirement for external memory. Implementation results achieve 259 fps on the BSDS500 dataset, which is ≈ 8.6× more than the requirement for real-time performance (30 frames per second).