What is the most efficient approach to record each line from an input file?

Charley Chen over 14 years ago

Hi All,

I use a table to record each line and its value, but the loop becomes slower and slower as it runs.

A lookup such as table["A11100"] => list(1000000 2000000 3000000 4000000) is very quick, but only once the loop has finished.

What is the best way to do this?

; write a template file
getCurrentTime()
outPort = outfile("test")
count = 1000000
fileCount = 1
for(i 1 count
    fprintf(outPort "1000000 2000000 3000000 4000000\n")
)
close(outPort)
getCurrentTime()

; read each line and record it in the table
nextLine = nil
inPort = infile("test")
getCurrentTime()
when(inPort
    fileCount = 1
    table = makeTable("table" nil)
    while(gets(nextLine inPort)
        qq = parseString(nextLine "\n")
        str = sprintf(nil "A%d" fileCount)
        table[str] = list(1000000 2000000 3000000 4000000)
        fileCount++
    ) ; while
    close(inPort) ; release the input port once the file has been read
) ; when
inPort = nil
getCurrentTime()

 

Thank you,

Charley

    Andrew Beckett over 14 years ago

    Charley,

    As I've mentioned several times - the SKILL profiler would help. In general it would be best to profile your real code (I presume this is a toy example), but when I profiled the read above, the time was mostly spent in gc (garbage collection). This I ran with a local file, and it spent 377 seconds out of 383 (in IC5141) in gc. Note that gc does not mean necessarily that there's garbage, but when it needs to allocate a chunk more memory for various objects it will first try to scan and collect garbage before allocating new space. 

    Because you're creating a lot of string and list cells (1 million and 4 million respectively), I can tell SKILL to pre-allocate memory for these. If I add:

    needNCells('list 4000000)
    needNCells('string 1000000)

    before reading the file, the reading loop takes 5.6 seconds (including population of the table).
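
As a minimal sketch of where those two calls would sit relative to the read loop from the original post (the cell counts assume the 1,000,000-line, four-values-per-line test file; the timings quoted above are not reproduced here):

```skill
; Reserve cells up front so the loop does not repeatedly trigger gc.
; Counts assume 1,000,000 lines, one key string and a 4-cell list per line.
needNCells('list 4000000)
needNCells('string 1000000)

inPort = infile("test")
when(inPort
    fileCount = 1
    table = makeTable("table" nil)
    while(gets(nextLine inPort)
        str = sprintf(nil "A%d" fileCount)
        table[str] = list(1000000 2000000 3000000 4000000)
        fileCount++
    )
    close(inPort)
)
```
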

    Note that in IC615 (not sure quite when in IC61 it happened) work was done on the chunk sizes so that it spends less time in gc - it was reduced to 105 seconds. But with the preallocation, it drops down to nothing in gc.

    You can also see the effect of this if you read the file twice. On the second read the table gets recreated, so all the previous keys and contents become garbage and are reclaimed (once), and from then on there is already plenty of memory allocated.

    gcsummary() is useful to find out what is allocated.

    Regards,

    Andrew.

    Charley Chen over 14 years ago

    Andrew,

    We don't have the license for the SKILL profiler (Tools - SKILL Development), so I can't use it.

    How do I choose the correct number?

    In this case the number of table[str] entries depends on the line count of the file, e.g. wc -l qq => 1000000.

    If wc -l qq => 10000000 (10 million), should I use needNCells('string 10000000)? Is that right?

    If table[str] = list(1000000 2000000 3000000 4000000 500 600), should I use needNCells('list 60000000) (60 million)? Is that right?

    What does the 10 million mean? When I just gave it a large number, it crashed.

    Thank you,

    Charley

    Andrew Beckett over 14 years ago

    Charley,

    You don't necessarily need to get it to be the number of lines - it will just minimize the amount of work it has to do.

    Allocating 10 million (or 60 million) of anything is a fairly unusual thing to do. Given that only you know what you're doing here - only you can know how you might be able to determine how many entries you need.

    The number is the number of slots needed. In the case of the strings, you had one string per line in the file (the key into the table). So for a million lines, that's a million strings. For the lists, the number is the number of list cells - there's one list cell per element per row - hence 4 million. gcsummary() tells you how big each type is, and how much total memory has been allocated for each type. As more are needed though, it will automatically allocate more memory - assuming you don't run out.
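
One way to turn that counting rule into code, as a sketch only: the procedure name CCreserveForFile and its elemsPerLine argument are hypothetical, and the extra pass over the file has its own cost (each gets() creates a temporary string), so passing in a line count you already know, for example from wc -l, may be preferable.

```skill
; Hypothetical helper: count the lines in a file, then reserve one string
; cell per line (the table key) and elemsPerLine list cells per line.
procedure(CCreserveForFile(fileName elemsPerLine)
    let((port line (nLines 0) nListCells)
        port = infile(fileName)
        when(port
            while(gets(line port)
                nLines++
            )
            close(port)
        )
        when(nLines > 0
            nListCells = nLines * elemsPerLine
            needNCells('string nLines)
            needNCells('list nListCells)
        )
        nLines ; return the line count to the caller
    )
)

; For the test file in this thread: 4 values per line.
CCreserveForFile("test" 4)
```
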

    If you start to allocate very large numbers, maybe you're running out of memory - it has to store the data somewhere. Particularly if you're running in 32 bit mode, there's a limit of roughly 2^32 bytes (less than 4Gbytes in practice) of available memory - even if your machine has much more. I don't know which version you're using, but in IC5141 only "layout" can be run in 64 bit mode (which would give you access to more memory). From IC614 the "virtuoso" executable can be run in 64 bit mode.
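
A rough back-of-the-envelope check, assuming the 12-byte list cell size that gcsummary() reports in the IC5141 session shown later in this thread (the size may differ in other versions or in 64 bit mode):

```skill
; Assumed size, taken from the "Size" column of the gcsummary() output
; later in this thread; not a guaranteed constant.
listCellBytes = 12
nListCells = 60000000
nListCells * listCellBytes   ; => 720000000 bytes for the list cells alone
```

That is a sizeable share of what a 32 bit process can address, before counting the strings, the table itself, and everything else in the session.
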

    Regards,

    Andrew.

    Charley Chen over 14 years ago

    Andrew,

    I need to collect each line and its data so that I can look up the information for any line later. Is using a table the best approach?

    I know the problem is in table[str] = list(1000000 2000000 3000000 4000000).

    If I comment that line out, the loop becomes very fast, but then nothing is stored.

    Thank you,

    Charley

    Andrew Beckett over 14 years ago

    Charley,

    Yes, a table would be the best solution almost certainly, as it's a hash table and offers near random-access lookup times (compared with lists which are sequential).

    However, you've got to store the data somewhere - and the needNCells() is just a way of giving SKILL a hint how much data you're likely to be creating, to avoid it having to incrementally check for garbage and then allocate chunks as it goes along. These are the kind of trade-offs that you have to make and balance carefully when you are optimizing for large data.
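
To illustrate the difference in lookup style, here is a small sketch using made-up keys and values; the association list is only there for comparison and is not something proposed in this thread.

```skill
; Table (hash) lookup: the key is hashed, so the cost does not grow
; with the position of the entry.
table = makeTable("table" nil)
table["A1"] = list(1000000 2000000 3000000 4000000)
table["A2"] = list(5 6 7 8)
table["A2"]              ; => (5 6 7 8)
table["missing"]         ; => nil, the default given to makeTable

; Association-list lookup: assoc walks the list from the front until
; it finds a matching key, so late entries cost more to find.
alist = list(list("A1" 1 2 3 4) list("A2" 5 6 7 8))
cdr(assoc("A2" alist))   ; => (5 6 7 8)
```
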

    Andrew.

    Charley Chen over 14 years ago

    Andrew,

    Once I use needNCells('list 1000000) or needNCells('string 1000000..), will that memory always be available?

    I will change the value and call needNCells repeatedly while working.

    Should I purge it when the program is done? I am afraid I will allocate too much memory ...

    Thank you,

    Charley

    Andrew Beckett over 14 years ago

    Charley,

    Calling needNCells() multiple times is not a problem - it does not add that number, but sets the upper limit. This means you cannot reduce it once allocated, because you've pre-allocated a pool ready to be used by SKILL. However, SKILL will be able to reuse any of the list or string cells which are no longer in use (for list cells or strings respectively) - this is what garbage collection is all about. For example, if I run gcsummary() before running your example that reads 1 million lines into a table:

    -----------------------------------------------------------
    Type       Size   Allocated     Free      Static   GC count
    -----------------------------------------------------------
    ...
    list         12     1593344    26124     1191936          3
    ...
    string        8      167936    81680      229376          0

    After running it, I see:

    -----------------------------------------------------------
    Type       Size   Allocated     Free      Static   GC count
    -----------------------------------------------------------
    ...
    list         12    50089984   719256     1191936          6
    ...
    string        8     8966144   818144      229376          3

    So if once you're done you do:

    table=nil
    gc() ; force a garbage collection rather than waiting until the system thinks one is necessary
    gcsummary()

    -----------------------------------------------------------
    Type       Size   Allocated     Free      Static   GC count
    -----------------------------------------------------------
    ...
    list         12    50089984 48842892     1191936          7
    ...
    string        8     8966144  8859424      229376          3

    So what you're seeing there is now it has freed up lots of list and string slots - which can be reused by other programs within the same session.

    Hope that helps!

    Regards,

    Andrew.

    Charley Chen over 14 years ago

    Andrew,

    ;Calling needNCells() multiple times is not a problem ...

    (1) Can I put the calls in a procedure such as QQ()?

    procedure( QQ()
        needNCells('list 10000..)
        needNCells('string 10000..)
        ..
    );pro

    (2) Is using two or three tables to store the same values better than using one table?

    Thank you,

    Charley

    Andrew Beckett over 14 years ago

     Charley,

    The answer to question 1 is yes, it's fine to call it inside a procedure (in fact that was what I was doing when I was checking your code).

    I don't understand the second question. You'll need to give some more details for me to be able to answer.

    Andrew.

    Charley Chen over 14 years ago

    Andrew,

    I use table = makeTable("table" nil) to store 1 million records. Without needNCells, it becomes slower as the loop goes on.

    Could I instead use table1to1000 = makeTable("table1to1000" nil) ; table1001to2000 = makeTable("table1001to2000" nil) ....

    Would that help?

    Thank you,

    Charley
