Archived
Commit graph

68 commits

Author SHA1 Message Date
Henrik Hautakoski
13330c322d buffer.c: generate a warning instead of an error in buffer_write() 2012-07-17 10:43:20 +02:00
Henrik Hautakoski
313816c425 error.c: let warn() return an int 2012-07-17 10:42:41 +02:00
Henrik Hautakoski
dea5aa6fe0 Makefile: don't optimize when compiling with the '-g' flag. 2012-07-12 10:34:22 +02:00
Henrik Hautakoski
52777c2ff8 buffer.c: fix 'off-by-one' error when indexing array. 2012-07-04 14:01:04 +02:00
Henrik Hautakoski
d18d581fd0 buffer.c: fix dereference of invalid pointer.
The code used the pointer 'b->block' after a reallocation that
may have moved the memory to another location, so 'b->block'
could be an invalid pointer in some cases. Use 'ret' instead.
2012-07-04 13:53:33 +02:00
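The pitfall described in the commit above can be sketched roughly as follows. This is not the actual buffer.c code: the struct layout and the helper name buffer_grow() are assumptions made for illustration; only the names 'b->block' and 'ret' come from the commit message.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type; only the field name 'block' is taken
 * from the commit message. */
struct buffer {
    char *block;   /* start of the allocation */
    size_t len;    /* bytes in use */
    size_t alloc;  /* bytes allocated */
};

/* Hypothetical helper: make room for at least 'extra' more bytes.
 * Returns 0 on success, -1 on allocation failure (buffer untouched). */
int buffer_grow(struct buffer *b, size_t extra)
{
    size_t need = b->len + extra;
    char *ret;

    if (need <= b->alloc)
        return 0;

    ret = realloc(b->block, need);
    if (!ret)
        return -1;  /* on failure the old b->block is still valid */

    /* realloc() may have moved the data, so the old value of
     * b->block must not be dereferenced any more: use 'ret'. */
    memset(ret + b->len, 0, need - b->len);
    b->block = ret;
    b->alloc = need;
    return 0;
}
```

Any work on the newly allocated area goes through 'ret' until it has been stored back into the struct.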
Henrik Hautakoski
ae91737fb1 proc-cache.c: lazy delete
Use lazy delete mechanism to remove the need for linked list delete.
This removes the need for a sentinel node at the beginning.
2012-07-04 12:02:42 +02:00
Henrik Hautakoski
51f7286ab8 dlight.c: move call to proc_cache_update()
proc_cache_update() should be called once, after we have walked through
the list of filters, not on every filter.
2012-06-28 17:06:42 +02:00
Henrik Hautakoski
d49b1b2456 dlight.c: Mark item in download history earlier.
It is more correct to mark an item as downloaded right after
it is actually fetched and not after it is successfully written to disk.
2012-06-28 17:02:14 +02:00
Henrik Hautakoski
365b657f9f proc-cache.c: use hash-table code from hash.c
Use the refactored code from hash.c. Also use chaining as the
collision strategy instead of open addressing, not only because the new
hash API makes open addressing hard to do, but because chaining is more
space efficient.

A collision with open addressing results in two occupied entries
in the hash table, but with chaining only one slot is occupied.
The worst-case complexity for search/insert/delete is still O(n) for both techniques.
Chaining is better because items that collide take up only one slot in the
hash table. Considering that the best case for space overflow is 25%, it
is better to have a small table.
2012-05-22 11:48:28 +02:00
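A minimal sketch of chaining as a collision strategy, for readers unfamiliar with it. The types and function names below are hypothetical and are not the hash.c or llist.h API; integer keys are used only to keep the example short.

```c
#include <stdlib.h>

/* Hypothetical chain node and table; not the real hash.c types. */
struct node {
    unsigned long key;
    struct node *next;
};

struct chain_table {
    struct node **slots;  /* one list head per slot */
    size_t size;
};

/* Returns 1 if inserted, 0 if the key was already present,
 * -1 on allocation failure. */
int chain_insert(struct chain_table *t, unsigned long key)
{
    size_t pos = key % t->size;
    struct node *n;

    for (n = t->slots[pos]; n; n = n->next)
        if (n->key == key)
            return 0;

    n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->key = key;
    n->next = t->slots[pos];  /* colliding keys share this one slot */
    t->slots[pos] = n;
    return 1;
}

/* Returns 1 if the key is in the table, 0 otherwise. */
int chain_lookup(const struct chain_table *t, unsigned long key)
{
    const struct node *n;

    for (n = t->slots[key % t->size]; n; n = n->next)
        if (n->key == key)
            return 1;
    return 0;
}
```

Keys that hash to the same position are linked from a single slot, so a collision never consumes a second slot in the table itself.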
Henrik Hautakoski
8a39596268 llist.h: Adding a singly linked list implementation
Needed for implementing chaining on hash collisions, and
I am tired of implementing dynamic arrays / linked lists every time.
2012-05-21 21:15:27 +02:00
Henrik Hautakoski
269ddffa78 dlight.c: use buffer_write() 2012-05-23 22:05:16 +02:00
Henrik Hautakoski
4c664f2400 buffer: add buffer_write() 2012-05-23 22:05:16 +02:00
Henrik Hautakoski
e0976e0d23 proc-cache.c: remove proc_cache_flush()
flush() is redundant; it makes more sense to just write the file on close().
There is no reason to commit the current state of the cache to disk
at any other time than when closing the application.
2012-04-17 21:14:49 +02:00
Henrik Hautakoski
4f05a5ae4f proc-cache.c: use hash-table code from hash.c 2012-04-12 07:46:28 +02:00
Henrik Hautakoski
5193e8453c hash.c: refactor out common hash table code
Both dlhist.c and proc-cache.c use similar hash table code.
Factor this out into a helper interface.
2012-04-11 21:09:28 +02:00
Henrik Hautakoski
c1198121bb cconf.c: use sha1_io.h 2012-02-17 23:30:49 +01:00
Henrik Hautakoski
0b2b159b2d Adding sha1_io.c/h: A wrapper interface for performing POSIX file I/O
with SHA1 as a CRC mechanism.

When writing file formats that use SHA1 as a CRC, it is handy to
have SHA1_Update() applied to every write(), so that a
SHA1 hash can be calculated for that data and used as a CRC check.

Therefore this interface is created to wrap the code used to do this.
2012-02-17 22:37:24 +01:00
Henrik Hautakoski
37ba894802 make dlight.c use the new dlhist 2011-12-15 13:35:13 +01:00
Henrik Hautakoski
40e6afeffa adding new dlhist.c 2011-12-15 13:29:23 +01:00
Henrik Hautakoski
1350330dd2 dlhist: rename to proc-cache.
A new data structure is about to take dlhist's place. dlhist is currently
implemented as a mixture: it is meant to be a "process cache" that records which
RSS items have been processed (that is why the URL is used as a unique
identifier), but right now it only stores a URL if it has been
downloaded. A new data structure, the "download history",
shall be implemented to keep track of each title and where
it has been downloaded to. This will make it possible to
download an RSS title to a location only once.

Splitting this data structure into two separate structures is trivial,
as a "process cache" will treat URLs as unique identifiers and
a "download history" will treat the title of an RSS item as a
unique identifier (and also track its destinations).

This commit does not change any functionality. I just rename
this to keep the "dlhist" prefix and source files clear for
when the real dlhist is implemented.
2011-11-14 16:11:18 +01:00
Henrik Hautakoski
90365e9de2 dlight.c: do not call perror when dlhist_open() fails 2011-11-05 16:45:43 +01:00
Henrik Hautakoski
dcd515b4b1 http.c: use buffer.h 2011-11-05 15:18:19 +01:00
Henrik Hautakoski
6456ac58bc env.c: use buffer.h 2011-11-05 15:18:19 +01:00
Henrik Hautakoski
710f761cc6 buffer: adding buffer_cstr_release 2011-11-05 15:18:19 +01:00
Henrik Hautakoski
5245d19d71 moving strbuf to buffer 2011-11-05 15:18:19 +01:00
Henrik Hautakoski
5220e42038 Adding strbuf.
Lifting a portion of the strbuf API/implementation found in archived.
This will serve as a base for a generic buffer API.
2011-11-05 15:18:19 +01:00
Henrik Hautakoski
378de035de dlhist.c: dlhist_lookup: pass a variable to he_empty().
he_empty() is a macro, so do not pass a function call as its argument.
When the macro expands, the function is called once for every use of the argument.
2011-11-03 14:36:10 +01:00
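The macro problem fixed in the commit above can be illustrated with a small sketch. The definition of he_empty() shown here is an assumption (the real macro in dlhist.c may look different), and find_slot() and the call counter are hypothetical helpers that exist only to make the repeated evaluation visible.

```c
#include <stddef.h>

/* Hypothetical entry type; not the real dlhist.c layout. */
struct hash_entry {
    const void *key;
    const void *value;
};

/* Assumed shape of the macro: 'e' appears twice, so whatever
 * expression is passed in may be evaluated twice. */
#define he_empty(e) ((e)->key == NULL && (e)->value == NULL)

struct hash_entry table[4];
int lookups;  /* number of find_slot() calls, for illustration */

struct hash_entry *find_slot(unsigned idx)
{
    lookups++;
    return &table[idx % 4];
}

/* Problematic: find_slot() runs once per occurrence of 'e'. */
int slot_is_empty_bad(unsigned idx)
{
    return he_empty(find_slot(idx));
}

/* Fixed: evaluate once, then pass the variable to the macro. */
int slot_is_empty_good(unsigned idx)
{
    struct hash_entry *e = find_slot(idx);
    return he_empty(e);
}
```

With an expensive or side-effecting function as the argument, the first form repeats that work on every expansion, while the second form calls it exactly once.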
Henrik Hautakoski
a0f46daa4a dlhist.c: store number of entries in header instead of table size
Now that the table size can be calculated, let's store the number of entries
instead of the size in the header, so we can rely on that when reading
entries instead of the actual size on disk. This is safer if data is
appended to the file outside of the application.
2011-11-03 13:42:59 +01:00
Henrik Hautakoski
bbefd9daf5 dlhist.c: calculate initial size when reading table from file.
Now that records are fixed size, it's easy to calculate the number of
entries in the file. Use that to calculate how large the hash table
should be.
2011-11-03 13:10:30 +01:00
Henrik Hautakoski
9517d28f72 dlhist.c: do linear-probing when inserting entries.
Somehow I apparently forgot to do linear probing in he_insert, which
causes colliding entries read from file (and when resizing)
to be dropped on the floor.

Let's not drop things on the floor anymore. Certainly there is
another place in the table that will do fine, instead of just
giving up and throwing the entry on the floor.
2011-10-26 16:48:29 +01:00
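A rough sketch of insertion with linear probing, as described in the commit above. This is not the real he_insert(): the entry layout, the 'used' flag, integer keys, and the function name are all assumptions chosen to keep the example short.

```c
#include <stddef.h>

/* Hypothetical table entry; the real dlhist.c record differs. */
struct entry {
    unsigned long key;
    int used;  /* 0 = empty slot */
};

/* Insert 'key' using linear probing: on a collision, try the next
 * slot (wrapping around) instead of giving up.
 * Returns the slot index used, or -1 if the table is full. */
int he_insert_sketch(struct entry *table, size_t size, unsigned long key)
{
    size_t start = key % size;
    size_t i;

    for (i = 0; i < size; i++) {
        size_t pos = (start + i) % size;

        if (!table[pos].used || table[pos].key == key) {
            table[pos].key = key;
            table[pos].used = 1;
            return (int)pos;
        }
    }
    return -1;  /* every slot is taken by another key */
}
```

A colliding key is placed in the next free slot rather than dropped, which is the behaviour the commit restores for entries read from file and for resizing.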
Henrik Hautakoski
113dc524ed Small fix in README 2011-10-17 14:34:23 +01:00
Henrik Hautakoski
c41f11e57a dlight.c: do not add a newline to error() calls 2011-10-25 16:58:24 +02:00
Henrik Hautakoski
e3bd4545a2 filter.c: compile: Oops, assigning function to char pointer
info->msg is being assigned 'error', but there is no such variable,
although there is such a function in error.h.

Fix this by assigning 'err' to info->msg instead; that is the variable
passed to pcre_compile().
2011-10-25 16:57:48 +02:00
Henrik Hautakoski
e39b9d64c6 env.c: use error.h 2011-10-12 14:15:08 +02:00
Henrik Hautakoski
6758b06188 dlhist: use sha1 hashes as keys to make records fixed size.
Use sha1 hashes instead of C strings to make records fixed size.
Because it's hard to find collisions in sha1 hashes, this works well
in practice, and dynamic memory allocation for the variable-size keys
is not needed anymore. Space is also reduced, since most key strings are
more than 20 bytes long.

Calculating sha1 should be fast enough to add no more overhead
than dynamic memory allocation did.
2011-10-04 15:29:45 +02:00
Henrik Hautakoski
1f46350f84 dlight.c: don't use errno from http_fetch_file
http_fetch_file does not set errno, so don't use it.
2011-10-02 22:42:34 +02:00
Henrik Hautakoski
356cddf07d error.h: Incorrect name in function prototype 2011-09-29 23:25:05 +02:00
Henrik Hautakoski
02814cc1f7 dlight.c: only download a file once.
When going through the filter list for an item, we download and store the item
every time a filter is matched.
This patch allows an item to be downloaded the first time a filter
matches and keeps the data throughout the rest of the list, so all
other matches never download the item again but use the data in memory.
2011-09-29 21:34:34 +02:00
Henrik Hautakoski
eb0702cd93 http: Adding http_fetch_file().
Sometimes you want to fetch a file into memory so you can
store it in multiple places on disk without having to download it
again or copy files. While http_fetch_page works for fetching data
into memory, the possible filename found in the 'Content-Disposition'
header field is not taken into account.

http_fetch_file() fetches the data and stores it in memory while trying to
get hold of the filename.
2011-09-29 19:54:02 +02:00
Henrik Hautakoski
ad4ac41b13 lockfile.c: missing stdlib.h for atexit() 2011-09-24 13:53:18 +02:00
Henrik Hautakoski
611409c777 lockfile: expose locked()
Useful to have this macro in the interface so other modules using lockfiles
can check if a certain lock is active.
2011-09-24 13:03:22 +02:00
Henrik Hautakoski
2f9e968717 dlight.c: more verbose output 2011-09-20 16:33:55 +02:00
Henrik Hautakoski
6a4d6af99d lockfile.h: small typo in header comment 2011-09-12 20:40:52 +02:00
Henrik Hautakoski
5eda6e7705 dlhist.c: resize_table: Oops, storing floating point value in integer. 2011-09-21 17:19:12 +02:00
Henrik Hautakoski
fce177b486 dlight.c: dlhist purge: remove magic number and set interval to 6h 2011-09-21 17:19:12 +02:00
Henrik Hautakoski
90505d8e5b remove new-line from error() calls. 2011-09-21 17:19:12 +02:00
Henrik Hautakoski
9767be013a http.c: refactor http request code.
Remove duplicated curl code in 'fetch_page' and 'download_file'
and put it in a helper function.
2011-09-21 17:19:12 +02:00
Henrik Hautakoski
8cf883eb29 http.c: remove new-line from error() calls
One new-line is always appended to the message by the error function.
2011-09-21 17:19:11 +02:00
Henrik Hautakoski
c1a7c9671d http.c: skip memory storage when downloading files
No need to store the HTTP data in memory when downloading files; write to disk directly.
2011-09-21 17:19:11 +02:00
Henrik Hautakoski
3aea2adbc6 dlight.c: print items that are downloaded 2011-09-21 17:19:11 +02:00