Recent Downtime / Post Mortem Analysis
#1
Post-Mortem Analysis
Incident Title: Forum Down
Date & Time of Incident: 2025-02-23 02:01:00 (UTC) to 2025-02-23 10:43:00 (UTC)
Prepared by: Q

1. Summary
  • The hard drive became full, rejecting any write operation.
  • The Webserver stopped normal operation, yielding "502 Bad Gateway" errors.
  • The Forum was down.

2. Root Cause Analysis
  • Logging and backups filled the hard drive, which had been at its limit for some time.


3. Timeline
Time (UTC)/Event Description
  • 02:01 dxasmodeus reported the issue via an external communication tool.
  • 10:31 Q started working on the server.
  • 10:37 Q determined the cause: the full hard drive.
  • 10:43 Q made space on the hard drive and restarted the server.

4. Resolution & Actions Taken
  • Old backups were deleted to free space.
  • The server was restarted.
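
Deleting old backups by hand could be turned into a simple retention rule. A minimal sketch, assuming the backups are plain files in one directory; the function name `prune_old_backups`, the directory and the `keep` count are hypothetical and not taken from the actual server setup:

```python
from pathlib import Path


def prune_old_backups(backup_dir, keep=3, dry_run=True):
    """Delete all but the `keep` most recently modified files in backup_dir.

    Returns the paths that were deleted (or, with dry_run=True, would be).
    """
    if keep < 0:
        raise ValueError("keep must be zero or positive")
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    # Sort newest first, by modification time.
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    stale = files[keep:]
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

The default `dry_run=True` only lists candidates, so the result can be checked before anything is actually removed.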

5. Immediate preventative Measures
  • Make additional space on webserver
  • Add a command to the backend login that reports the remaining free space in megabytes.
  • Add a file `buffer-storage-blocker-for-emergencies.dat` to the root of the filesystem, so that in the future space can be freed even more quickly by deleting it.
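
The last two measures can be sketched in a few lines. This is only an illustration, not the command actually installed on the server: the function names `free_megabytes` and `create_ballast` are hypothetical, and "megabyte" is assumed to mean 1024 × 1024 bytes.

```python
import os
import shutil

MB = 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def free_megabytes(path="/"):
    """Return the free space on the filesystem containing `path`, in whole MB."""
    return shutil.disk_usage(path).free // MB


def create_ballast(path, size_mb):
    """Write `size_mb` MB of zero bytes to `path` as an emergency reserve.

    The bytes are really written (not a sparse file), so that deleting
    the file later actually gives space back.
    """
    chunk = b"\0" * MB
    with open(path, "wb") as fh:
        for _ in range(size_mb):
            fh.write(chunk)
        fh.flush()
        os.fsync(fh.fileno())


if __name__ == "__main__":
    print(f"Free space on /: {free_megabytes('/')} MB")
```

On a filesystem with transparent compression, a file of zeros may not reserve real space; random data would be the safer filler there.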

6. Further possible Mitigations
  • Create an alerting capability that notifies the responsible admins, e.g. by email.
  • Mount variable-size directories on a separate volume (e.g. the uploads, the log files and/or the database).
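
An email alert of this kind could look roughly like the following. It is a sketch under assumptions: the threshold, the addresses and the SMTP host are placeholders, and the function names are hypothetical. It would be run periodically, e.g. from cron.

```python
import shutil
import smtplib
from email.message import EmailMessage


def used_percent(path="/"):
    """Return the used share of the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def build_alert(path, percent, threshold, sender, recipient):
    """Return an EmailMessage if `percent` reaches `threshold`, else None."""
    if percent < threshold:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Disk usage on {path} at {percent:.1f} %"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"The filesystem at {path} is {percent:.1f} % full "
        f"(alert threshold: {threshold:.1f} %)."
    )
    return msg


def send_alert(msg, host="localhost", port=25):
    """Send the message through an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
```

Building the message is kept separate from sending it, so the threshold logic can be checked without a mail server. The percentage may differ slightly from what `df` shows, since `df` accounts for blocks reserved for root.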

7. Lessons Learned
  • Communication among staff worked well.
  • The absence of monitoring/alerting meant that the nearly full hard drive was not noticed quickly.
  • There is a permanent lack of storage on the server.

8. Action Items
Task/Owner/Deadline/Status
  • Space on Webserver / Q / 2025-02-23 / Done
  • Add Command / Q / 2025-03-05 / Done
  • Add File / Q / 2025-02-23 / Done
  • Separate Data to a separate Volume/Mount Point / swammy+Q / 2025-02-27 / Done 
  • Complete Post-Mortem-Analysis / Q / 2025-03-02 / Done 
  • Consider further Actions & Mitigations / Futanari Staff / 2025-03-09 / Open
Reply
#2

Thanks for the update!

Reply
#3
(24th February 2025, 19:29)Graciela L Wrote: Thanks for the update!

Thank you for staying around :)

I chatted with our owner swammy yesterday and we doubled the hard drive capacity to 100 GB, for roughly 2 dollars a month. That should give us respite on the storage front for a good while.

Additionally, now that we run way below 90 % hard drive capacity, write operations should be quite quick again.

I will also add the topic of automated alerting mails for consideration.
Reply
#4

I need to make major changes to the server (resizing the hard drive) today at 19:30 (UTC). Hopefully this will only take 30 minutes.

Reply
#5

Making the necessary changes turned out to be more troublesome than expected.

We have more storage space now.

Reply
#6

You are putting a lot of work into this. Thanks for the time you are spending. More storage space is good. Haven't seen Swammy in a little bit though.

Reply
#7
(1st March 2025, 00:26)Keahi19 Wrote: You are putting a lot of work into this. Thanks for the time you are spending. More storage space is good. Haven't seen Swammy in a little bit though.

Thank you.


Yeah, we have different roles among the Futanari Palace staff, according to our personal preferences. Our owner Swammy is a little laid back; we chat with him every few days or so, and he's doing fine. Then we have dxasmodeus, who keeps up with the forum. I am very happy that he cares about the content of posts and the rules. And I, Q, in turn take care of the technology.

I do have an agenda, though a very small one; a humble approach, I dare hope:

Keeping things steady and running nicely, so that our community and its creations may live on.

To that end we are gathering some structured records on what to do and how to approach it. For example, here is a screenshot of some recent topics on which we are gathering some thoughts:

   


Kind Regards, Q
Reply
#8

So guys, I know I'm not very active here. There are some reasons (besides the fact that I'm just a laid back retired navy veteran) but I'm in very regular contact with Q. We chat on a very regular basis on Discord. So, I'm not exactly completely gone.

However, I wanted to make sure everyone knows: I've got the ownership and the payments for keeping the server alive covered. No one needs to worry about that as long as I am still alive and have wonderful people like Q helping me keep this old community alive.

Reply
#9

Glad to hear, and hope we can get things worked out as you work on this. ^_^

Reply

