Speed improvements

The following tables present the average results of the benchmark run to compare the performance of GaiaXPy v2.1.0 with that of GaiaXPy v2.0.1 (its immediate predecessor).

The benchmark, which consisted of running each of the tools over input files in different formats, was repeated 20 times to compute the average times. All of the files contained the same 5,000 sources.
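The repeat-and-average procedure described above can be sketched as follows. This is a minimal illustration, not the script used for the benchmark: the names `average_runtime` and `example_task` are hypothetical, and the placeholder workload stands in for a call to one of the tools on an input file.

```python
import statistics
import time


def average_runtime(task, repetitions=20):
    """Run `task` `repetitions` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def example_task():
    # Placeholder workload (assumption); a real run would call one tool on one input file.
    sum(i * i for i in range(10_000))


mean_seconds = average_runtime(example_task)
print(f"{mean_seconds:.4f}s")
```

Using `time.perf_counter` keeps the measurement on a monotonic, high-resolution clock, so the average is not affected by system clock adjustments.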

Environment details

Processor: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz x 8

Memory: 16GB

Operating system: Ubuntu 22.04.2 LTS 64-bit

Disk model: WDC PC SN730 SDBQNTY-512G-1001

Package dependencies: astropy==5.3, numpy==1.25.0, pandas==2.0.2, scipy==1.10.1.

Input reading and output writing

The tables below show the average time it takes GaiaXPy to read a file of 5,000 rows in different formats, and also the time it takes each tool to write the corresponding output.

GaiaXPy v2.0.1 (old version)

| Format/Function | Input reading | Output writing (Calibrate) | Output writing (Convert) | Output writing (Generate) |
|---|---|---|---|---|
| CSV | 5.30s | 4.16s | 11.50s | 0.06s |
| ECSV | 5.21s | 4.32s | 12.72s | 0.09s |
| FITS | 3.14s | 0.05s | 0.30s | 0.04s |
| XML binary* | 36.55s | 9.75s | 31.44s | 0.43s |
| XML plain | 30.16s | 9.94s | 31.73s | 0.49s |

GaiaXPy v2.1.0 (new version)

| Format/Function | Input reading | Output writing (Calibrate) | Output writing (Convert) | Output writing (Generate) |
|---|---|---|---|---|
| CSV | 6.00s | 4.29s | 12.11s | 0.11s |
| ECSV | 5.91s | 4.35s | 12.42s | 0.10s |
| FITS | 3.51s | 0.18s | 0.27s | 0.05s |
| XML binary* | 34.09s | 7.76s | 25.28s | 0.58s |
| XML plain | 28.90s | 7.84s | 25.25s | 0.54s |

*GaiaXPy XML output is always plain.

As can be seen, input reading takes longer in v2.1.0 than in the previous version for CSV, ECSV and FITS, and output writing is also slower for some combinations of format and tool.

The variations in input reading time arise because some matrices, which were previously parsed during processing, are now parsed during reading. This adds a slight overhead to input reading but greatly accelerates the processing.
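To illustrate the trade-off described above, the toy sketch below parses string-encoded arrays either while reading or later, during processing. It is not GaiaXPy's code: the column name `coefficients`, the text layout and the helper names are assumptions made for the example.

```python
import csv
import io

# Toy input with one string-encoded array per row (hypothetical layout and column names).
CSV_TEXT = 'source_id,coefficients\n1,"(1.0, 2.0, 3.0)"\n2,"(4.0, 5.0, 6.0)"\n'


def parse_array(text):
    """Turn a string such as "(1.0, 2.0)" into a list of floats."""
    return [float(value) for value in text.strip("()").split(",")]


def read_rows(text, parse_on_read):
    """Read the CSV text; optionally parse the arrays as part of the reading step."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if parse_on_read:
        for row in rows:
            row["coefficients"] = parse_array(row["coefficients"])
    return rows


def process(rows):
    """Sum each array, parsing first if the reader left the values as strings."""
    totals = []
    for row in rows:
        values = row["coefficients"]
        if isinstance(values, str):
            values = parse_array(values)  # parsing cost paid during processing
        totals.append(sum(values))
    return totals


# Both strategies give the same result; only the step that pays for parsing differs.
print(process(read_rows(CSV_TEXT, parse_on_read=True)))
print(process(read_rows(CSV_TEXT, parse_on_read=False)))
```

When the parsing happens in `read_rows`, the reading step becomes slower and `process` only has to do arithmetic, which mirrors the shift in timings reported here.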

Regarding output writing, the files written by GaiaXPy v2.1.0 are similar but not identical to the ones generated by its predecessor, as mentioned in the Release Notes.

Reading and processing times

The tables below show the time taken to run each of the tools over input files of 5,000 rows in different formats. These times include both the reading and processing steps, but exclude the time used to save the output to a file.

GaiaXPy v2.0.1 (old version)

| Format/Function | Calibrate | Convert | Generate (w/o error_corr) | Generate (w/ error_corr) |
|---|---|---|---|---|
| CSV | 35.34s | 60.02s | 6.62s | 13.17s |
| ECSV | 35.57s | 59.79s | 6.57s | 13.04s |
| FITS | 33.37s | 57.62s | 4.48s | 10.96s |
| XML binary | 66.46s | 90.52s | 37.91s | 44.42s |
| XML plain | 60.08s | 84.00s | 31.59s | 38.23s |

GaiaXPy v2.1.0 (new version)

| Format/Function | Calibrate | Convert | Generate (w/o error_corr) | Generate (w/ error_corr) |
|---|---|---|---|---|
| CSV | 9.36s | 12.10s | 6.58s | 8.81s |
| ECSV | 9.29s | 12.02s | 6.48s | 8.78s |
| FITS | 6.97s | 9.51s | 4.08s | 6.37s |
| XML binary | 37.27s | 40.00s | 34.49s | 36.82s |
| XML plain | 32.14s | 34.89s | 29.51s | 31.76s |

Speed-up factors

| Format/Function | Calibrate | Convert | Generate (w/o error_corr) | Generate (w/ error_corr) |
|---|---|---|---|---|
| CSV | 3.78 | 4.96 | 1.01 | 1.49 |
| ECSV | 3.83 | 4.97 | 1.01 | 1.49 |
| FITS | 4.79 | 6.06 | 1.10 | 1.72 |
| XML binary | 1.78 | 2.26 | 1.10 | 1.21 |
| XML plain | 1.87 | 2.41 | 1.07 | 1.20 |

In general, the speed improvements are smaller when the input file is XML, whether binary or plain. This is likely because most of the code changes were made in the processing step rather than in the reading step. Reading XML is slow in general, and most of the execution time for these files is spent reading them. GaiaXPy uses Astropy to read these files, and Astropy may or may not improve its reading speed in the future.
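The weight of the reading step can be checked directly from the figures reported on this page. The sketch below, using the v2.1.0 Calibrate times, computes the share of the combined reading and processing time that is spent on reading; the dictionary and function names are illustrative only.

```python
# Times in seconds for Calibrate in v2.1.0, copied from the tables on this page.
reading = {
    "CSV": 6.00, "ECSV": 5.91, "FITS": 3.51, "XML binary": 34.09, "XML plain": 28.90,
}
reading_and_processing = {
    "CSV": 9.36, "ECSV": 9.29, "FITS": 6.97, "XML binary": 37.27, "XML plain": 32.14,
}


def reading_share(fmt):
    """Fraction of the reading-plus-processing time spent on reading, for one format."""
    return reading[fmt] / reading_and_processing[fmt]


for fmt in reading:
    print(f"{fmt}: {reading_share(fmt):.0%}")
```

With these numbers, reading accounts for roughly nine tenths of the time for both XML formats, against about half for FITS, so a faster processing step has much less room to reduce the total for XML input.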

Processing times

The tables below show the results when only the processing step of each tool is considered (i.e. times for reading input and writing output are excluded).

GaiaXPy v2.0.1 (old version)

| Format/Function | Calibrate | Convert | Generate (w/o error_corr) | Generate (w/ error_corr) |
|---|---|---|---|---|
| CSV | 30.04s | 54.72s | 1.32s | 7.87s |
| ECSV | 30.36s | 54.58s | 1.36s | 7.83s |
| FITS | 30.23s | 54.48s | 1.34s | 7.82s |
| XML binary | 29.91s | 53.97s | 1.36s | 7.87s |
| XML plain | 29.92s | 53.84s | 1.43s | 8.07s |

GaiaXPy v2.1.0 (new version)

| Format/Function | Calibrate | Convert | Generate (w/o error_corr) | Generate (w/ error_corr) |
|---|---|---|---|---|
| CSV | 3.36s | 6.10s | 0.58s | 2.81s |
| ECSV | 3.38s | 6.11s | 0.57s | 2.87s |
| FITS | 3.46s | 6.00s | 0.57s | 2.86s |
| XML binary | 3.18s | 5.91s | 0.40s | 2.73s |
| XML plain | 3.24s | 5.99s | 0.61s | 2.86s |

Speed-up factors

| Format/Function | Calibrate | Convert | Generate (w/o error_corr) | Generate (w/ error_corr) |
|---|---|---|---|---|
| CSV | 8.94 | 8.97 | 2.28 | 2.80 |
| ECSV | 8.98 | 8.93 | 2.39 | 2.73 |
| FITS | 8.74 | 9.08 | 2.35 | 2.73 |
| XML binary | 9.41 | 9.13 | 3.40 | 2.88 |
| XML plain | 9.23 | 8.99 | 2.34 | 2.82 |
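The speed-up factors listed above match the ratio of the v2.0.1 time to the v2.1.0 time, rounded to two decimals. A short sketch reproducing the CSV row from the processing times follows; the variable and function names are illustrative only.

```python
# Processing times in seconds for CSV input, copied from the tables above.
old_times = {
    "Calibrate": 30.04,
    "Convert": 54.72,
    "Generate (w/o error_corr)": 1.32,
    "Generate (w/ error_corr)": 7.87,
}
new_times = {
    "Calibrate": 3.36,
    "Convert": 6.10,
    "Generate (w/o error_corr)": 0.58,
    "Generate (w/ error_corr)": 2.81,
}


def speed_up(old_seconds, new_seconds):
    """Return how many times faster the new version is than the old one."""
    return old_seconds / new_seconds


for tool in old_times:
    print(f"{tool}: {speed_up(old_times[tool], new_times[tool]):.2f}")
```

A factor above 1 means v2.1.0 is faster; for example, Calibrate on CSV input goes from 30.04s to 3.36s, a factor of about 8.94.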