Description
Intermittent instability was detected on the san12 storage infrastructure, causing severe latency and temporary freezes when accessing data over NFS. The issue occurred randomly: storage access sometimes behaved normally and at other times stalled for several minutes.
Root cause
The incident was caused by abnormal behavior on a single disk within a ZFS mirrored vdev. This disk exhibited extreme I/O latency, temporarily blocking storage operations at the server level and preventing timely NFS responses to client requests.
Although the ZFS pool remained online and data integrity was preserved, these I/O stalls caused visible service disruptions for systems relying on NFS storage.
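As an illustration of how such a disk can be singled out, the minimal Python sketch below compares the average I/O latency of each member of a mirror against its peers. The device names, the sample values, and the thresholds are assumptions made for this example, not measurements from san12; in practice the samples would come from a tool such as `iostat -x` or `zpool iostat -l`.

```python
from statistics import median


def find_slow_disks(latency_ms, ratio=10.0, floor_ms=50.0):
    """Return devices whose mean latency is far above that of their peers.

    latency_ms maps a device name to a list of latency samples (ms).
    A device is flagged when its mean exceeds both floor_ms and
    ratio times the median of the other devices' means.
    The thresholds are illustrative defaults, not tuned values.
    """
    means = {dev: sum(s) / len(s) for dev, s in latency_ms.items() if s}
    slow = []
    for dev, mean in means.items():
        peers = [m for other, m in means.items() if other != dev]
        if not peers:
            continue  # a single device has nothing to be compared against
        if mean > floor_ms and mean > ratio * median(peers):
            slow.append(dev)
    return sorted(slow)


# Hypothetical samples for a two-disk mirror: da1 stalls intermittently.
samples = {
    "da0": [4.1, 5.3, 3.9, 4.8],
    "da1": [6.0, 2800.0, 4.5, 9100.0],
}
print(find_slow_disks(samples))
```

Comparing each disk with its mirror peers, rather than with a fixed limit alone, keeps the check meaningful when the whole pool is simply busy.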
Corrective actions
Identification of the faulty disk responsible for the I/O stalls
Preventive offlining of the affected disk (service continuity ensured by ZFS redundancy)
Immediate stabilization of NFS access
Enhanced monitoring of the storage subsystem
Planned replacement of the faulty disk
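The offlining and replacement steps listed above map onto the standard `zpool offline` and `zpool replace` subcommands. The sketch below only builds and prints those commands in dry-run mode. The pool name `tank` and the device names `da1` and `da5` are hypothetical placeholders, not the actual san12 configuration.

```python
import shlex
import subprocess


def offline_cmd(pool, device):
    """Command that takes a device out of service in a pool."""
    return ["zpool", "offline", pool, device]


def replace_cmd(pool, old_device, new_device):
    """Command that replaces old_device with new_device and starts a resilver."""
    return ["zpool", "replace", pool, old_device, new_device]


def run(cmd, dry_run=True):
    """Print the command in dry-run mode, otherwise execute it."""
    if dry_run:
        print("would run:", shlex.join(cmd))
        return None
    return subprocess.run(cmd, check=True)


# Hypothetical names: pool "tank", faulty disk "da1", spare disk "da5".
run(offline_cmd("tank", "da1"))
run(replace_cmd("tank", "da1", "da5"))
```

Keeping dry-run as the default means nothing is changed on the pool until the printed commands have been reviewed.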
Current status
All services are operational and stable.
No data loss occurred thanks to the redundant storage configuration.
Next steps
Permanent replacement of the affected disk
Scheduled reboot of the storage server to apply NFS performance optimizations
Additional monitoring to detect early signs of storage degradation
We apologize for the inconvenience caused and thank you for your understanding.
Faulty disk replacement is currently in progress on the san12 storage system. The affected drive has been taken out of service and is being replaced with a spare disk. Thanks to storage redundancy, services remain operational and no data loss is expected. Monitoring continues during the resilvering process.
The storage pool is now stable and fully operational. The faulty disk has been replaced with a spare drive, and the data reconstruction process (resilvering) has completed successfully with no data loss.
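Resilver progress and completion of this kind are typically read from the `scan:` line of `zpool status`. The following sketch classifies that line. The sample outputs are illustrative text written for this example (pool name `tank` is hypothetical), and the exact wording of the line may vary between OpenZFS versions, so the matching should be treated as an assumption.

```python
import re


def resilver_state(status_text):
    """Classify the 'scan:' line found in `zpool status` output."""
    match = re.search(r"^[ \t]*scan:[ \t]*(.*)$", status_text, re.MULTILINE)
    if match is None:
        return "unknown"
    scan = match.group(1)
    if scan.startswith("resilver in progress"):
        return "in_progress"
    if scan.startswith("resilvered"):
        # A clean resilver reports "with 0 errors" on the same line.
        return "done" if "with 0 errors" in scan else "done_with_errors"
    return "other"


# Illustrative samples, not captured from san12.
running = """  pool: tank
 state: DEGRADED
  scan: resilver in progress since Tue Jan  2 06:47:15 2024
"""
finished = """  pool: tank
 state: ONLINE
  scan: resilvered 1.21T in 03:12:45 with 0 errors on Tue Jan  2 10:00:00 2024
"""
print(resilver_state(running), resilver_state(finished))
```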
A maintenance operation is planned to perform the physical replacement of the failed disk. Services remain available and under monitoring until this intervention.