Topic: Outline of "what seti@home code really does"
BenHer
I figured since I'm down to the two main time hogs in seti, I should figure out which functions do what to what buffers of data, and in what order.
How long is this buffer's data needed? Which functions call which...how many loops do they do...etc?
Here is the result...it explains why 'GetFixedPot' and 'analyze_pot' are the main CPU hogs (cache miss hogs really).

analyze_seti
  foreach fft/gauss pair {
    if chirp rate ind changes
      ChirpData - fills array ChirpedData[]
      CalcTrigArray

    // Now we will process the just chirped data...
    // We will break the entire ChirpedData[] into 'fftlen' chunks and do each one
    // as a block
    //   1. backward FFT over each block and output to WorkData[]
    //   2. compute power spectrum table over fft'ed output, store in PowerSpectrum[CurrentSub]
    //   3. if pot freq bin == -1, then try to find spike in PowerSpectrum[]

    analyze_pot <- PowerSpectrum[]

      // Look for gaussians :: by looping through frequencies * FftLength
      GetFixedPoT() <- PowerSpectrum[] --> GaussPoT[]
        If fftlen were 1024...then for Frequency=1...GaussPot[] would contain
          GaussPoT[0] = spectrum[0]
          GaussPoT[1] = spectrum[1024]   // cache miss
          GaussPoT[2] = spectrum[2048]   // ...and so on
        frequency = 2
          GaussPoT[0] = spectrum[1]
          GaussPoT[1] = spectrum[1025]
          GaussPoT[2] = spectrum[2049]
      GaussFit() <- GaussPoT[]

      // Look for pulses :: by looping through frequencies * FftLength
      // loop through time for each frequency. PulsePoTNum is used
      Build a PulsePoT[] table extracted from the PowerSpectrum[] buffer
      (similar to method in GetFixedPot above)
        - find_triplets on this PulsePoT[]
        - find_pulse(s) on this PulsePoT[]
  } // end of fft/gauss pair
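
To make the cache-miss point concrete, here is a minimal C sketch of the strided read pattern described in the outline. It is not the real GetFixedPoT(); the function name gather_pot and its parameters are made up for illustration, and it assumes PowerSpectrum[] is laid out as consecutive spectra of fftlen floats each.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the actual seti@home routine): collect the power
   values of one frequency bin across all spectra into a contiguous array.
   Assumed layout: spectrum[t * fftlen + bin], one fftlen-sized block per
   time step t. */
static void gather_pot(const float *spectrum, size_t fftlen,
                       size_t num_spectra, size_t bin, float *pot)
{
    assert(bin < fftlen);
    for (size_t t = 0; t < num_spectra; t++) {
        /* Successive reads are fftlen floats apart. With fftlen = 1024 that
           is 4096 bytes, so every read lands on a different cache line. */
        pot[t] = spectrum[t * fftlen + bin];
    }
}
```

With fftlen = 1024 and bin = 1, the loop reads spectrum[1], spectrum[1025], spectrum[2049], and so on, which is the second listing in the outline. The writes to pot[] are sequential and cheap; it is the reads that stride through memory.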
Josef W. Segur
I figured since I'm down to the two main time hogs in seti, I should figure out which functions do what to what buffers of data, and in what order.
How long is this buffer's data needed? Which functions call which...how many loops do they do...etc?
Here is the result...it explains why 'GetFixedPot' and 'analyze_pot' are the main CPU hogs (cache miss hogs really)....
All true, but 5.17+ will have the transposed PowerSpectrum. Then all the data points for a frequency will be in a contiguous vector. Of course, doing the transposition will have to deal with the same cache issue.

I wonder how well the IPP functions ippmCopy_va_32f_SS() or ippmTranspose_m_32f() would do for that. The docs say that the matrix operations are meant for much smaller sizes, but I guess the routines may still be coded so autovectorization works well anyhow.

If it seems I'm urging you to go on to the 5.17 source, that's true. Simon's 5.17 builds have so far not shown the expected improvement from transposition; if you could profile a 5.17 build, that might indicate why not.

Joe
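
One common way to soften the cache cost of the transposition itself is to work in small tiles, so that both the rows being read and the columns being written stay in cache while a tile is processed. The C sketch below shows that general technique only. It is not the 5.17 code and not what the IPP routines do internally; the name transpose_blocked and the TILE value of 32 are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 32 /* assumed tile edge; the best value depends on the cache */

/* Hypothetical sketch: transpose a rows x cols row-major matrix `src` into
   the cols x rows row-major matrix `dst`, one TILE x TILE tile at a time.
   For the power spectrum, rows would be the number of spectra (time steps)
   and cols would be fftlen (frequency bins). */
static void transpose_blocked(const float *src, float *dst,
                              size_t rows, size_t cols)
{
    assert(src != dst); /* out-of-place only */
    for (size_t r0 = 0; r0 < rows; r0 += TILE) {
        size_t r1 = (r0 + TILE < rows) ? r0 + TILE : rows;
        for (size_t c0 = 0; c0 < cols; c0 += TILE) {
            size_t c1 = (c0 + TILE < cols) ? c0 + TILE : cols;
            /* Inside a tile, only TILE source rows and TILE destination
               rows are touched, so their cache lines get reused. */
            for (size_t r = r0; r < r1; r++)
                for (size_t c = c0; c < c1; c++)
                    dst[c * rows + r] = src[r * cols + c];
        }
    }
}
```

After the call, the time series for frequency bin f sits contiguously at dst[f * rows] through dst[f * rows + rows - 1], so a per-frequency loop can walk it sequentially. Whether this beats the strided reads overall would still need profiling.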