I’m trying to move away from cron jobs. Not that they don’t work, but I want to get with the times and learn a few things along the way.

I created two user timers (and the associated services), one to back up my data and the other to upload it to B2. They run two scripts that I had in my cron jobs for a few years, where they worked without problems. With systemd timers, though, both scripts get killed by signal 15 (SIGTERM), and I have no idea why.

I run Debian 12 Bookworm.

Here’s the output for the status of the upload service:

> systemctl --user status rclone-up.service
○ rclone-up.service - Run rclone up for b2
     Loaded: loaded (/home/clmbmb/.config/systemd/user/rclone-up.service; disabled; preset: enabled)
     Active: inactive (dead)
TriggeredBy: ● rclone-up.timer

Apr 11 06:10:39 tesla systemd[1698218]: Starting rclone-up.service - Run rclone up for b2...
Apr 11 06:12:18 tesla systemd[1698218]: rclone-up.service: Main process exited, code=killed, status=15/TERM
Apr 11 06:12:18 tesla systemd[1698218]: rclone-up.service: Failed with result 'signal'.
Apr 11 06:12:18 tesla systemd[1698218]: Stopped rclone-up.service - Run rclone up for b2.
Apr 11 06:12:18 tesla systemd[1698218]: rclone-up.service: Consumed 12.811s CPU time.

Also, here’s the log created by rclone while running:

2024/04/11 06:10:42 INFO  : integrity.2376: Copied (new)
2024/04/11 06:10:43 INFO  : hints.2376: Copied (new)
2024/04/11 06:10:43 INFO  : nonce: Copied (replaced existing)
2024/04/11 06:10:47 INFO  : config: Updated modification time in destination
2024/04/11 06:10:55 INFO  : index.2376: Copied (new)
2024/04/11 06:11:40 INFO  :
Transferred:      443.104 MiB / 2.361 GiB, 18%, 16.475 MiB/s, ETA 1m59s
Checks:              1503 / 1503, 100%
Transferred:            4 / 19, 21%
Elapsed time:       1m0.8s
Transferring:
 *                                   data/2/2328: 19% /502.259Mi, 2.904Mi/s, 2m19s
 *                                   data/2/2329: 52% /500.732Mi, 10.758Mi/s, 22s
 *                                   data/2/2330: 14% /501.598Mi, 3.150Mi/s, 2m15s
 *                                   data/2/2331:  0% /500.090Mi, 0/s, -

2024/04/11 06:12:18 INFO  : Signal received: terminated

Where should I look to get some more information about what’s going on? Why would the service be terminated like that?

Later edit:

Setting TimeoutSec=infinity inside the [Service] section of the unit file seems to help. I’m not 100% sure it’s a good idea, but I’ll experiment with it.
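
One way to try this without touching the unit file itself is a drop-in override. A minimal sketch, assuming the service is a user unit under ~/.config/systemd/user as in the status output above; the helper name write_timeout_dropin and the file name timeout.conf are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: write a drop-in that sets TimeoutSec= for a unit.
# $1 = drop-in directory, e.g. ~/.config/systemd/user/rclone-up.service.d
write_timeout_dropin() {
    local dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    # TimeoutSec= is shorthand for TimeoutStartSec= and TimeoutStopSec=.
    printf '[Service]\nTimeoutSec=infinity\n' > "$dropin_dir/timeout.conf"
}

# Example (not run here):
#   write_timeout_dropin ~/.config/systemd/user/rclone-up.service.d
```

After writing the drop-in, `systemctl --user daemon-reload` picks it up, and `systemctl --user show rclone-up.service -p TimeoutStartUSec -p TimeoutStopUSec` shows which timeouts are actually in effect.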

    • calm.like.a.bomb@lemmy.dbzer0.com (OP)
      7 months ago
      [Unit]
      Description=Run rclone up for b2
      
      [Service]
      Type=oneshot
      ExecStart=/zet/Users/radu/backup/rclone_up.sh
      
      [Install]
      WantedBy=rclone-up.timer
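
For comparison, a more conventional layout puts the [Install] section in the timer and points it at timers.target; a service that is only started by its timer does not need an [Install] section at all. Below is a sketch of such a timer, written from a shell helper so it can be dropped into any directory. The helper name write_rclone_timer and the daily schedule are placeholders, not taken from the post.

```shell
#!/bin/bash
# Hypothetical helper: write a timer unit that activates rclone-up.service.
# $1 = unit directory, e.g. ~/.config/systemd/user
write_rclone_timer() {
    local unit_dir=$1
    mkdir -p "$unit_dir" || return 1
    # A timer activates the service of the same name unless Unit= says otherwise.
    printf '%s\n' \
        '[Unit]' \
        'Description=Schedule rclone up for b2' \
        '' \
        '[Timer]' \
        '# Placeholder schedule, not the one from the original post.' \
        'OnCalendar=daily' \
        '' \
        '[Install]' \
        '# timers.target is what pulls timer units in at startup.' \
        'WantedBy=timers.target' \
        > "$unit_dir/rclone-up.timer"
}

# Example (not run here):
#   write_rclone_timer ~/.config/systemd/user
#   systemctl --user enable --now rclone-up.timer
```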
      
      • somethingsomethingidk@lemmy.world
        7 months ago

        You could try setting TimeoutStopSec=infinity for the service. There may be a default timeout for services, and it’s killing rclone before it can finish, because a oneshot service is considered “starting” until the program exits.

        • Oinks
          7 months ago

          TimeoutStopSec applies to the ExecStop command, TimeoutStartSec would be the culprit here. I’m not sure why there would be a default timeout of specifically 1:39 minutes though.
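
The 1:39 comes from the journal timestamps in the status output: started at 06:10:39, killed at 06:12:18. A quick sketch of the arithmetic (to_seconds is a helper made up for this example):

```shell
#!/bin/bash
# Convert an HH:MM:SS timestamp to seconds since midnight.
to_seconds() {
    local h m s rest
    h=${1%%:*}; rest=${1#*:}
    m=${rest%%:*}; s=${rest#*:}
    # Strip one leading zero so "08" and "09" are not parsed as octal.
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

start=$(to_seconds 06:10:39)   # "Starting rclone-up.service" in the journal
end=$(to_seconds 06:12:18)     # "Main process exited, code=killed"
elapsed=$(( end - start ))

echo "ran for ${elapsed}s"     # prints "ran for 99s"
```

As far as I can tell, 99 seconds does not line up with systemd's stock DefaultTimeoutStartSec of 90 s, and systemd.service(5) says the start timeout is disabled by default for Type=oneshot anyway, so a plain start timeout looks like an unlikely explanation.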

      • Oinks
        7 months ago

        Does your script fork at some point (and possibly exit before the rclone job is completed)? If so, you need to use Type=forking instead of simple or oneshot; otherwise systemd will start cleaning up child processes when the script exits.

        Edit: Actually, considering the time span involved, Type=forking will not solve your issue, because it will time out. If this is the problem, you need to change your script so that it doesn’t fork.

        • calm.like.a.bomb@lemmy.dbzer0.com (OP)
          7 months ago

          No, my script doesn’t fork and I don’t think rclone does that either.

          Here’s the script (pretty simple):

          #!/bin/bash
          
          repos=(fotorepo multirepo persorepo appconfigs)
          
          if pidof -x rclone >/dev/null; then
            echo "Process already running"
            exit 1
          fi
          
          for repo in "${repos[@]}"; do
              inidate=$(date +'%Y%m%d_%H%M%S')
              /usr/bin/rclone -v --log-file=/backup/borg/logs/${repo}_b2sync_${inidate}.log sync /backup/borg/${repo} b2:${repo}
              if [[ $? -eq 0 ]]; then
                MSGDATE=$(date +'%d/%m/%y %T')
                mesaj="[${MSGDATE}] Upload for ${repo} was successful."
                curl -H "Title: B2 Upload" -H "Tags: arrow_double_up" -d "${mesaj}" "${URL}"
                #sendmsg "[${MSGDATE}] Upload for <b>${repo}</b> was <b><u>successful</u></b>."
              else
                MSGDATE=$(date +'%d/%m/%y %T')
                mesaj="[${MSGDATE}] Upload for ${repo} has failed. Check the logs."
                curl -H "Title: B2 Upload" -H "Tags: warning" -H "prio:high" -d "${mesaj}" "${URL}"
                #sendmsg "[${MSGDATE}] Upload for <b>${repo}</b> has <b><u>failed</u></b>. Check the logs."
              fi
              enddate=$(date +'%Y%m%d_%H%M%S')
              mv /backup/borg/logs/${repo}_b2sync_${inidate}.log /backup/borg/logs/${repo}_b2sync_${enddate}.log
          done
          
          • Oinks
            7 months ago

            Indeed, that all looks fairly innocuous. Just in case, you are sure that you didn’t just accidentally kill or killall rclone or bash?

            Perhaps wrapping the script in strace might help debug where the offending signal is coming from.

            • calm.like.a.bomb@lemmy.dbzer0.com (OP)
              7 months ago

              Just in case, you are sure that you didn’t just accidentally kill or killall rclone or bash?

              No. The process runs at night. Only if my dog started learning Linux and tested something! That makes me wonder…

  • Magickmaster@feddit.de
    7 months ago

    I think I read something about there being timeouts for slow jobs, which backups definitely are. Really unsure though, it has been a while. Try spawning independent worker programs!

    • calm.like.a.bomb@lemmy.dbzer0.com (OP)
      7 months ago

      I just read some internet posts and the documentation. I set TimeoutSec=infinity inside the service and set the timer to run it now. I’ll see if that helps.

  • orcrist@lemm.ee
    7 months ago

    Nothing wrong with learning new tricks, but it’s worth mentioning on the side that sometimes a cron job is the right tool.

  • rollingflower@lemmy.kde.social
    7 months ago

    Here is my template

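    # "sudo cat > file" fails because the redirect runs as the normal user; tee runs as root.
    # Quoting 'EOF' keeps the shell from expanding the $(...) parts while writing the file.
    sudo tee /etc/systemd/user/rsync-backup.service >/dev/null <<'EOF'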
    [Unit]
    Description=do rsync backups with some conditions
    # After=network-online.target
    
    [Service]
    Type=oneshot
    # require a power connection (optional)
    # ExecStartPre=sh -c '[ $(cat /sys/class/power_supply/AC/online) = 1 ]'
    
    # require battery over 40%
    # ExecStartPre=sh -c '[ $(cat /sys/class/power_supply/BAT0/capacity) -ge 40 ]'
    
    # require the connected network to NOT be "metered"
    # ExecStartPre=sh -c '! $(nmcli -t -f GENERAL.METERED dev show | grep -q 'yes')'
    
    ExecStart=/home/user/.local/bin/rsync-backup
    # you might add everything you need
    # ExecStart=/path/to/something/else
    
    # delete old logs (disabled for testing)
    # ExecStartPost=rm -f /var/log/rsync-backups.log
    # log the updates
    # ExecStartPost=sh -c 'echo "Last backup: $(date)" > /var/log/rsync-backup.log'
    # write errors to log
    StandardError=file:/var/log/rsync-backups.log
    
    # GUI message
    #ExecStartPost=/usr/bin/notify-send -t 0 -a "Backup" "rsync backup finished." "$(output of some command if you want infos about the backup)"
    
    # run with low priority, when idling
    # Nice=15
    IOSchedulingClass=idle
    
    # when conditions were not met, try again after 15 minutes
    # Restart=on-failure
    # RestartSec=900
    
    [Install]
    WantedBy=default.target
    EOF
    

    Timer file:

    sudo tee /etc/systemd/user/rsync-backup.timer >/dev/null <<'EOF'
    [Unit]
    Description=do rsync backups with some conditions
    
    [Timer]
    OnCalendar=daily
    Persistent=true
    
    [Install]
    WantedBy=timers.target
    EOF
    

    (I think the [Unit] section is needed.)

    That is a slightly modified variant of my automatic rpm-ostree system updates which took an hour or so with the help of ChatGPT and a lot of testing around.

    Systemd services are lit.

    If you add a “repeat when conditions are not met” rule, you need another timer to start it. Think of it as two loops: one big loop to start the process, and one small loop to keep trying until the conditions are met. I do that with my system updates to prevent them from running:

    • with low battery (or even using an AC requirement)
    • over a metered network
    • when the system is busy
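
The power-related conditions could also live in a small script instead of inline sh -c calls. A rough sketch, assuming the sysfs paths from the template above (AC and BAT0 may be named differently on other machines). The names at_least and power_ok are made up for this example, and treating AC power as an alternative to the 40% battery threshold is a choice made here, not something the template does. Metered-network and load checks are left out.

```shell
#!/bin/bash
# Hypothetical pre-flight check for a backup service.

# Succeeds when the integer stored in a file is >= the given threshold.
at_least() {
    local file=$1 min=$2 value=
    read -r value < "$file" || [ -n "$value" ] || return 1
    [ "$value" -ge "$min" ]
}

# Succeeds when on AC power, or when the battery holds at least 40%.
# Both paths can be overridden, which also makes the check easy to test.
power_ok() {
    local ac=${1:-/sys/class/power_supply/AC/online}
    local bat=${2:-/sys/class/power_supply/BAT0/capacity}
    at_least "$ac" 1 2>/dev/null || at_least "$bat" 40 2>/dev/null
}

# Example (not run here): end the script with a bare call so that its
# exit status becomes the script's exit status.
#   power_ok
```

Saved as an executable script that ends with a `power_ok` call, this could be referenced from a single ExecStartPre= line in the service.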