Accounting Troubleshooting

After the Lenovo update, our mechanism for informing users of their consumed hours stopped working, because it depended on the output of the command scontrol show assoc_mgr. In the example below, the QOS reports UsageRaw=0.000000 and zero used minutes for every TRES:

[bbruzzo@snmgt01 ~]$ scontrol show assoc_mgr | grep -A 7 "QOS=qos_pisca_145("
QOS=qos_pisca_145(122)
    UsageRaw=0.000000
    GrpJobs=N(0) GrpJobsAccrue=N(0) GrpSubmitJobs=N(0) GrpWall=N(0.00)
    GrpTRES=cpu=N(0),mem=N(0),energy=N(0),node=N(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu=N(0),gres/gpu:v100=N(0),gres/gpumem=N(0),gres/gpuutil=N(0)
    GrpTRESMins=cpu=6000000(0),mem=N(0),energy=N(0),node=N(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu=60000(0),gres/gpu:v100=N(0),gres/gpumem=N(0),gres/gpuutil=N(0)
    GrpTRESRunMins=cpu=N(0),mem=N(0),energy=N(0),node=N(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu=N(0),gres/gpu:v100=N(0),gres/gpumem=N(0),gres/gpuutil=N(0)
    MaxWallPJ=
    MaxTRESPJ=
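
Each entry in the GrpTRES* lines above has the form name=limit(used), where N means no limit is set. The following is a minimal sketch of how one such line could be parsed; the function name parse_tres_line and the regular expression are illustrative assumptions, not part of our tooling, and the sketch assumes the entry format matches the sample output shown here.

```python
import re

# Matches entries such as "cpu=6000000(0)" or "gres/gpu:v100=N(0)".
TRES_ENTRY = re.compile(r"([^=,]+)=([^(),]+)\(([^()]*)\)")


def parse_tres_line(line):
    """Return {tres_name: (limit, used)} for one Grp*TRES* line.

    Hypothetical helper: a limit of "N" (no limit set) becomes None.
    """
    # Drop the leading key ("GrpTRESMins") and keep the comma-separated entries.
    _, _, values = line.strip().partition("=")
    result = {}
    for name, limit, used in TRES_ENTRY.findall(values):
        result[name] = (None if limit == "N" else int(limit), float(used))
    return result


line = "    GrpTRESMins=cpu=6000000(0),mem=N(0),gres/gpu=60000(0)"
print(parse_tres_line(line)["cpu"])
```

With the sample line from this page, the cpu entry yields a limit of 6000000 minutes and a used value of 0.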

Database Review

To dump the database, run the following from mmgt02:

sudo mysqldump --single-transaction --databases slurm_acct_db > backup.sql

Parsing from sacct

sacct -X -a -A pisca_73 --starttime=2025-01-01 --parsable2 --noheader --format=elapsedraw,ncpus | awk -F'|' '{sum+=$1*$2} END {print sum/3600}'
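
The awk program above multiplies each job's elapsed seconds by its CPU count, sums the products, and divides by 3600 to obtain CPU hours. A minimal Python sketch of the same arithmetic follows; the function name cpu_hours and the sample rows are invented for illustration, and the input is assumed to be the two-column --parsable2 --noheader output of the command above.

```python
def cpu_hours(sacct_output):
    """Sum elapsedraw * ncpus over all rows and convert seconds to hours."""
    cpu_seconds = 0
    for row in sacct_output.splitlines():
        if not row.strip():
            continue  # skip blank lines
        elapsed_raw, ncpus = row.split("|")
        cpu_seconds += int(elapsed_raw) * int(ncpus)
    return cpu_seconds / 3600


# Invented sample: one job of 3600 s on 4 CPUs, one job of 1800 s on 2 CPUs.
sample = "3600|4\n1800|2\n"
print(cpu_hours(sample))  # 5.0
```

The first job contributes 14400 CPU seconds and the second 3600, for 18000 CPU seconds in total, or 5 CPU hours.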

Python script for the hours report

#!/usr/bin/env python3.10
"""Print the CPU hours consumed by each project account since 2025-01-01."""

import subprocess

# Only accounts whose names start with one of these prefixes are reported.
ACCOUNT_PREFIXES = ('pad', 'pci', 'pisca')

def get_accounts():
    """Return every account name known to sacctmgr."""
    command = ['sacctmgr', '--noheader', 'list', 'account', 'format=account']
    output = subprocess.run(command, capture_output=True, encoding='utf-8')
    return output.stdout.split()

def get_hours(account):
    """Print the account name followed by its CPU hours (elapsed seconds * CPUs / 3600)."""
    command = ['sacct', '-X', '-a', '-A', str(account), '--starttime=2025-01-01', '--parsable2', '--noheader', '--format=elapsedraw,ncpus']
    pipe_command = ['awk', '-F|', '{sum+=$1*$2} END {print sum/3600}']

    # Equivalent to the shell pipeline "sacct ... | awk ...".
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    pipe_proc = subprocess.Popen(pipe_command, stdin=proc.stdout, stdout=subprocess.PIPE, encoding='utf-8')
    proc.stdout.close()  # lets sacct receive SIGPIPE if awk exits first
    stdout, _ = pipe_proc.communicate()
    proc.wait()

    print(account)
    print(stdout)

if __name__ == '__main__':
    for account in get_accounts():
        if account.startswith(ACCOUNT_PREFIXES):
            get_hours(account)
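
To turn the hours computed from sacct into a message for users, the total can be compared against a limit such as the GrpTRESMins cpu value of the matching QOS (6000000 minutes in the sample output on this page). The sketch below is only an illustration: usage_report is a hypothetical helper, the mapping between an account and its QOS limit is assumed, and the 2500 hours in the example are invented.

```python
def usage_report(account, used_cpu_hours, limit_cpu_minutes):
    """Format used CPU hours against a limit expressed in CPU minutes.

    Hypothetical helper; limit_cpu_minutes is assumed to come from the
    GrpTRESMins cpu entry of the QOS associated with the account.
    """
    limit_hours = limit_cpu_minutes / 60
    percent = 100 * used_cpu_hours / limit_hours
    return (f"{account}: {used_cpu_hours:.1f} of {limit_hours:.0f} "
            f"CPU hours used ({percent:.1f}%)")


# Invented example: 2500 CPU hours used against a 6000000-minute limit.
print(usage_report("pisca_145", 2500, 6000000))
# pisca_145: 2500.0 of 100000 CPU hours used (2.5%)
```
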
accounting_troubleshooting.1770146601.txt.gz · Last modified: by bbruzzo