dlight is a program that checks items in rss feeds and downloads those items/links that match a set of rules.
This repository has been archived on 2026-05-10. You can view files and clone it, but you cannot make any changes to its state, such as pushing and creating new issues, pull requests or comments.
Henrik Hautakoski 1350330dd2 dlhist: rename to proc-cache.
A new data structure is about to take dlhist's place. dlhist is currently
implemented as a mixture: it is meant to be a "process cache" that records
which rss items have been processed (that is why the url is used as a unique
identifier), but right now it only stores a url if the item has been
downloaded. A new data structure, the "download history",
shall be implemented. It will keep track of each title and where
it has been downloaded to. This will make it possible to download
an rss title to a given location only once.

Splitting this data structure into two separate structures is trivial,
as a "process cache" will treat URLs as a unique identifier and
a "download history" will treat the title in an rss item as a
unique identifier (and also track its destinations).

This commit does not change any functionality. It only renames the
module so that the "dlhist" prefix and source files stay free for
the real dlhist implementation.
2011-11-14 16:11:18 +01:00
.gitignore gitignore: ignore executeables and config file 2011-09-21 17:19:09 +02:00
buffer.c buffer: adding buffer_cstr_release 2011-11-05 15:18:19 +01:00
buffer.h buffer: adding buffer_cstr_release 2011-11-05 15:18:19 +01:00
cconf.c compiled file format: associate destination whit each invidual filter 2011-09-21 17:19:09 +02:00
cconf.h compiled file format: associate destination whit each invidual filter 2011-09-21 17:19:09 +02:00
compile.c Make the other modules use error.c 2011-09-21 17:19:10 +02:00
config.sample compiled file format: associate destination whit each invidual filter 2011-09-21 17:19:09 +02:00
COPYING License with GPLv2 2011-09-21 17:19:09 +02:00
dlight.c dlhist: rename to proc-cache. 2011-11-14 16:11:18 +01:00
env.c env.c: use buffer.h 2011-11-05 15:18:19 +01:00
env.h License with GPLv2 2011-09-21 17:19:09 +02:00
error.c Adding error.c/.h 2011-09-21 17:19:10 +02:00
error.h error.h: Incorrect name in function prototype 2011-09-29 23:25:05 +02:00
filter-check.c adding command line tool for checking an regex 2011-09-21 17:19:11 +02:00
filter.c filter.c: compile: Oops, assigning function to char pointer 2011-10-25 16:57:48 +02:00
filter.h License with GPLv2 2011-09-21 17:19:09 +02:00
http.c http.c: use buffer.h 2011-11-05 15:18:19 +01:00
http.h http.c: use buffer.h 2011-11-05 15:18:19 +01:00
lockfile.c lockfile.c: missing stdlib.h for atexit() 2011-09-24 13:53:18 +02:00
lockfile.h lockfile: expose locked() 2011-09-24 13:03:22 +02:00
Makefile dlhist: rename to proc-cache. 2011-11-14 16:11:18 +01:00
proc-cache.c dlhist: rename to proc-cache. 2011-11-14 16:11:18 +01:00
proc-cache.h dlhist: rename to proc-cache. 2011-11-14 16:11:18 +01:00
read-config.c Make the other modules use error.c 2011-09-21 17:19:10 +02:00
README Small fix in README 2011-10-17 14:34:23 +01:00
rss.c License with GPLv2 2011-09-21 17:19:09 +02:00
rss.h License with GPLv2 2011-09-21 17:19:09 +02:00

	  Dlight - automatic feed downloader

	--------------------------------------

dlight is a program that checks items in rss feeds and downloads those
items/links that match a set of rules.
What sets it apart from other programs of this type is that configuring
the program should be easy and flexible, without forcing users to write
and maintain large lists of raw regular expressions.

The best way to run dlight is through a time-based scheduler such as cron.
--------------------------------------------------------
# Make cron execute dlight every 15 minutes
*/15 * * * * /path/to/dlight >> /path/to/logs/dlight.log
--------------------------------------------------------


dlight is divided into three major components: the dlight program,
the configuration files and the compiler.

	* dlight

The actual program that checks feeds and downloads items.
The configuration data is read from the compiled config file "~/.dlight/config".

The program first fetches the rss file (target), then walks through
all items, applying all filters associated with the current target.
If one matches, that item is downloaded to the destination associated
with the target. It does this for all rss files (targets) in the config.
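
The loop described above can be sketched roughly as follows. This is only
an illustration, not dlight's actual code: the struct layouts and the names
target_matches() and process_target() are made up for this example, the
filters are assumed to be POSIX extended regular expressions, and the
download itself is replaced by a printf().

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical types for illustration; not dlight's real structures. */
struct item {
	const char *title;
	const char *link;
};

struct target {
	const char *url;            /* rss file to fetch */
	const char *dest;           /* where matched items are stored */
	const char *const *filters; /* patterns, assumed to be POSIX ERE */
	size_t nfilters;
};

/* Return 1 if any filter of the target matches the title, else 0. */
int target_matches(const struct target *t, const char *title)
{
	size_t i;

	assert(t != NULL && title != NULL);
	for (i = 0; i < t->nfilters; i++) {
		regex_t re;
		int hit;

		/* A pattern that fails to compile is skipped in this sketch. */
		if (regcomp(&re, t->filters[i],
			    REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
			continue;
		hit = (regexec(&re, title, 0, NULL, 0) == 0);
		regfree(&re);
		if (hit)
			return 1;
	}
	return 0;
}

/* Walk all items of one target and report each item that matches.
 * Returns the number of matched items. */
size_t process_target(const struct target *t,
		      const struct item *items, size_t nitems)
{
	size_t i, matched = 0;

	for (i = 0; i < nitems; i++) {
		if (!target_matches(t, items[i].title))
			continue;
		/* The real program would download items[i].link here. */
		printf("%s -> %s\n", items[i].link, t->dest);
		matched++;
	}
	return matched;
}
```

Calling process_target() once per target in the config gives the outer
loop over all rss files.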

	* Configuration Files

A set of human-readable configuration files through which the user
configures dlight. This is where users define their targets, destinations,
filters and other types of information. (Currently there is only one file,
with a structure similar to the one the compiled format uses.)

	* Compiler

A compiler is provided that compiles the configuration files down to a
binary config file used by dlight. One can think of this step as
publishing/updating the configuration used by the program.
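
One common way to make such a publish step safe against a concurrently
running reader is to write the new file under a temporary name and then
rename() it over the old one. The sketch below shows that technique only;
whether dlight's compiler works this way is not stated here, and the
function name publish_file() and the ".tmp" suffix are assumptions made
for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Write len bytes of data to path so that a reader sees either the old
 * or the new file, never a half-written one. Returns 0 on success and
 * -1 on failure. Hypothetical helper, not part of dlight. */
int publish_file(const char *path, const void *data, size_t len)
{
	char tmp[4096];
	FILE *fp;
	int n;

	assert(path != NULL);

	/* Temporary file next to the target, so both are on one filesystem. */
	n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	fp = fopen(tmp, "wb");
	if (fp == NULL)
		return -1;

	if (len > 0 && fwrite(data, 1, len, fp) != len) {
		fclose(fp);
		remove(tmp);
		return -1;
	}
	if (fclose(fp) != 0) {
		remove(tmp);
		return -1;
	}

	/* On POSIX systems rename() replaces the target atomically. */
	if (rename(tmp, path) != 0) {
		remove(tmp);
		return -1;
	}
	return 0;
}
```

A run of dlight that opens the config while the compiler is working then
reads the previous complete file instead of a partial one.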


This design is used for two main reasons.

One: suppose you are editing your configuration while dlight is executed
by, for example, cron. If dlight read directly from those files, they might
not be in the desired state at that moment, and dlight could behave
unexpectedly.

The second reason is that processing all those files every time dlight is
invoked can be quite slow; the compiled format is designed to provide fast
I/O reads.
Also, with a source -> compiler -> output design, errors in the
configuration files can be caught when the user invokes the compiler.
That is a more natural way of notifying the user of such errors than
having dlight abort and log the error: because the program is supposed to
be executed automatically, such an error would not be seen right away.
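
As an illustration of catching errors at compile time, a compiler could
reject a broken filter pattern before anything is written to the compiled
config. The sketch below assumes filters are POSIX extended regular
expressions; check_pattern() is a hypothetical helper and not a function
from dlight.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Try to compile pattern. Returns 0 if it is valid. On failure returns
 * -1 and, if errbuf is given, stores a human-readable message in it. */
int check_pattern(const char *pattern, char *errbuf, size_t errlen)
{
	regex_t re;
	int rc;

	assert(pattern != NULL);

	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
	if (rc != 0) {
		if (errbuf != NULL && errlen > 0)
			regerror(rc, &re, errbuf, errlen);
		return -1;
	}
	regfree(&re);

	/* Valid pattern: leave an empty message behind. */
	if (errbuf != NULL && errlen > 0)
		errbuf[0] = '\0';
	return 0;
}
```

The message in errbuf can be printed straight to the terminal of the user
running the compiler, instead of ending up in a log file that is read
much later.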