[ddj] Defining datajournalism

Antti Jogi Poikola antti.poikola at gmail.com
Thu Oct 13 21:38:41 UTC 2011


This has been an interesting discussion to follow!

I'll share my limited understanding and a few questions that came to mind
while reading the Wikipedia articles and some other sources.

*Datajournalism*
http://en.wikipedia.org/wiki/Datajournalism
Is an umbrella term that covers other, more specific terms related to the use
of data in journalism, most importantly "data-driven journalism",
"computer-assisted reporting", "database journalism", different flavors of
visualization, and interactive journalistic applications.

- "It designates the increased amount of numerical data" -> What about text
analysis, algorithms for making sense of less structured data, etc.?


*Data-driven journalism (DDJ)*
http://en.wikipedia.org/wiki/Data-driven_journalism
Is a process that ends in publishing a data-article, preferably
accompanied by the raw data.

- "Data-driven journalism deals with open
data<http://en.wikipedia.org/wiki/Open_science_data> that
is freely available online and analyzed with open
source<http://en.wikipedia.org/wiki/Open_source> tools."
-> What about leaked data (not freely available)? What about collected data,
like large-scale surveys conducted by a news outlet?

- The Wikipedia entry (which needs a lot of wikifying work) emphasises
visualization so much that I started to wonder whether it is possible to make
a data-article without using any kind of visualization.

- The Wikipedia entry also focuses so much on the data that I started to
wonder whether it is possible to do data-driven journalism that doesn't
actually produce any stories, but only publishes collected, cleaned and
processed data. Is storytelling an essential part of data-driven journalism?
-> I would say that it is...

- In my opinion, an important characteristic of data-driven journalism is its
investigative nature (investigative journalism). Therefore I would not be so
eager to include "real-time" or "timeliness" in the definition. I believe
that in the not so distant future fast breaking news will be produced only by
computers and algorithms, but there will still be a place for journalism that
is investigative and even slow (
http://www.najp.org/articles/2008/09/the-slow-journalism-movement-h.html ).
-> Can machine-made news be called data-driven journalism (or journalism at
all)?


*Computer assisted reporting (CAR)*
http://en.wikipedia.org/wiki/Computer_assisted_reporting
- Is a somewhat old-school but widely used term that covers collecting and
analysing data for journalistic purposes. I understand the CAR process as a
smaller subset of the DDJ process, since it doesn't cover topics like
digital publishing and publishing the raw data.


*Database journalism*
http://en.wikipedia.org/wiki/Database_journalism
- I haven't read Holovaty yet, and I really don't understand what database
journalism is or is not. From the Wikipedia entry I got the idea that it
is more focused on organization, information management, workflow, etc.
than on actually getting stories out and published. -> Would that make
database journalism an organizational / methodological approach inside
data-driven journalism?


*Online journalism*
http://en.wikipedia.org/wiki/Online_journalism
-Is this relevant at all?


-Jogi

On 13 October 2011 19:58, Lorenz Matzat <matzat at gmail.com> wrote:

> Ah, well. Now I understand; I was wondering about the distinction
> between datajournalism and ddj.
>
> I would choose digital journalism over data journalism in this case to
> avoid misinterpretations.
>
> Lorenz
>
>
>
> Am 13.10.11 16:47, schrieb Nicolas Kayser-Bril:
> > Hi Lorenz,
> >
> > Thanks for spreading the conversation in German!
> >
> > I agree with you that it's better when data plays a real role in the story,
> > whatever the form. Now, stories made with CAR (or even stories with a couple
> > of figures in them) have been described as "datajournalism", as well as
> > serious games that didn't have anything to do with data proper. That's why I
> > proposed a distinction between data-driven journalism, that represents your
> > definition, and datajournalism, that stands for "innovative stuff in
> > journalism that involves computers".
> >
> > Given how afraid traditional publishers are of all things digital, I
> > thought it might help to have a concept they wouldn't look down on.
> >
> > Best
> >
> > nicolas.
> > --
> > Datajournalist since 2007
> > nkb.fr <http://nkb.fr?m>
> > +336 50 57 53 80
> >
> >
> >
> > On Thu, Oct 13, 2011 at 3:58 PM, Lorenz Matzat <matzat at gmail.com> wrote:
> >
> >> Hi,
> >>
> >> I wrote about this discussion a blog entry in German:
> >>
> >>
> >>
> http://www.datenjournalist.de/neues-von-der-definition-des-datenjournalismus/
> >>
> >> What I would like to add: For a data-article (as a product of
> >> datajournalism) - be it an interactive visualisation or whatever - I
> >> find the used data has to have a role in the story itself. Doing CAR or
> >> relying and researching in tables or numeric data is nothing new for
> >> journalism. And most articles etc. are using bits of numerical data or
> >> statistics - but that's not datajournalism.
> >>
> >> For me datajournalism is a journalistic product where the data plays a
> >> main role. Be it as the database for an app etc. or  the discussion or
> >> story about the data itself plays a significant role. And best is, when
> >> the underlying data is published as raw and open as possible alongside.
> >>
> >> So one question is, where does datajournalism start? Is CAR datajournalism?
> >>
> >> Lorenz
> >>
> >>
> >> Am 11.10.11 21:45, schrieb Tim McNamara:
> >>> Unstructured data are actually where I think journalists are able to
> >>> really excel. Leaks are their traditional fodder.
> >>>
> >>>
> >>> On 12 October 2011 05:12, marco Laucelli <mlaucelli at gmail.com> wrote:
> >>>> Hi all, sorry for the delay of my answer.
> >>>>
> >>>> Nicolas, I completely agree that getting close to real-time is a
> >>>> pan-journalism topic. However, my point was that the instrumentation of
> >>>> the world (producing mostly structured data) and the massive adoption of
> >>>> social/mobile networks by people (producing unstructured data) point to a
> >>>> good opportunity to get close to real-time. And, in my opinion, this is a
> >>>> very good opportunity to approach that pan-journalism challenge from
> >>>> data-driven journalism.
> >>>>
> >>>> I will try to write a proposal to include an unstructured data
> paragraph
> >> in
> >>>> the entry, and I'll share with all of you for feedback. Ok?
> >>>> Thanks a lot for your views and comments.
> >>>> Best regards,
> >>>> Marco.
> >>>>
> >>>>
> >>>> 2011/10/6 Nicolas Kayser-Bril <n.kayserbril at gmail.com>
> >>>>>
> >>>>> Marco,
> >>>>> Many thanks for your input.
> >>>>> There's no wonder we don't have the same definition: Scholars haven't
> >> been
> >>>>> able to agree on a universal definition for journalism for the past
> 50
> >>>>> years! (which is perfectly normal as Adrian Holovaty, Bob Woodward
> and
> >> the
> >>>>> anchor on Rossiya1 don't have anything but this name in common).
> That's
> >> why
> >>>>> I prefer using "information management".
> >>>>> Concerning the quest for real-time information, I believe it is a
> >>>>> pan-journalism issue. After I realized that the entry about
> journalism
> >>>>> didn't mention timeliness at all, I've added "in a timely fashion" at
> >> the
> >>>>> end of the definition ("Journalism is the practice
> >>>>> of investigation and reporting of events, issues and trends to a
> broad
> >>>>> audience in a timely fashion.")
> >>>>> As for unstructured data, you're absolutely right that structuring
> >>>>> heterogeneous bits of data is part of data-driven journalism (say,
> when
> >> you
> >>>>> OCRize a scanned paper to obtain an XLS file). Do you want to add a
> >>>>> paragraph about it in the DDJ entry?
> >>>>> best
> >>>>> nkb.
> >>>>> --
> >>>>> Datajournalist since 2007
> >>>>> nkb.fr
> >>>>> +336 50 57 53 80
> >>>>>
> >>>>>
> >>>>> On Thu, Oct 6, 2011 at 5:41 PM, marco Laucelli <mlaucelli at gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Hi all,
> >>>>>> this discussion is very exciting to me, and I would be happy to get
> >> your
> >>>>>> feedback on the following, that is just a summary of my thoughts of
> >> what
> >>>>>> DDJournalism means to me. In the usual definitions of Data
> Journalism
> >> I've
> >>>>>> found, as those cited in your previous mails, there are at least two
> >>>>>> important aspects I find lacking:
> >>>>>>
> >>>>>> 1) Getting closer to real-time: As the technology is evolving the
> >> world
> >>>>>> is becoming more and more instrumented and interconnected, as a
> >> consequence
> >>>>>> there would be a large amount of data created and transmitted in
> >> real-time,
> >>>>>> involving interactions and transactions among people, devices and
> >> physical &
> >>>>>> logical entities. The ability of developing capabilities to monitor
> >> that
> >>>>>> data traffic and detect noticeable events - based on data - seems to
> >> be one
> >>>>>> of the future scopes of Data Journalism. In the future it seems to
> me
> >>>>>> important considering News Media and journalist being able to
> detect,
> >>>>>> contextualize and rapidly analyze data-events and translate them
> into
> >>>>>> interesting news. Very simple example of this could be traffic
> events,
> >>>>>> emergencies, etc... Some would be noticed and published by public
> and
> >>>>>> private entities, but the technical ability and the skills to do so
> >>>>>> independently should be - in my opinion - one of the objectives of
> >>>>>> future journalism.
> >>>>>>
> >>>>>> 2) Use of unstructured data: there is a tremendous focus - in what
> >> refers
> >>>>>> most common DDJournalism references - on structured data, and in
> >> particular
> >>>>>> on structured data coming from public entities (Open Data). It seems
> >> to me
> >>>>>> very important for the future DDJournalism being able to capture
> >> relevant
> >>>>>> and noticeable information from unstructured data and in particular
> >> from
> >>>>>> digital conversations such those being held in social networks. The
> >> ability
> >>>>>> of structuring that information, and combining it with structured
> data
> >> is
> >>>>>> crucial - in my opinion, again - to get complete insight of what is
> >>>>>> happening behind data. I think that any definition of DDJournalism
> >> should
> >>>>>> take into account seriously unstructured data.
> >>>>>>
> >>>>>> I've put my views on the different flavors of DD Journalism in a
> chart
> >>>>>> explaining the conceptual links between time-scales and the data
> >> sources for
> >>>>>> DD Journalism. I'm currently working on a definition of the
> underlying
> >>>>>> conceptual architecture - tech capabilities, processes & skills - to
> >> support
> >>>>>> those different flavors. I would really appreciate any feedback on
> >> this, and
> >>>>>> I'm completely open to collaborations for this purpose.
> >>>>>>
> >>>>>> Thanks in advance for your interest.
> >>>>>> Kind regards,
> >>>>>> Marco.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> 2011/10/6 Nicolas Kayser-Bril <n.kayserbril at gmail.com>
> >>>>>>>
> >>>>>>> #done
> >>>>>>> Changelog:
> >>>>>>>
> >>>>>>> Creation of entry Datajournalism
> >>>>>>> Complete overhaul and change of meaning of Database journalism
> >>>>>>> Creation of Structured journalism, which redirects to Database
> >>>>>>> journalism
> >>>>>>> Re-creation of the page Computer-assisted_reporting, which, in my
> >>>>>>> opinion should be merged with data-driven journalism
> >>>>>>> Proposal of a merger of Computational Journalism with data-driven
> >>>>>>> journalism
> >>>>>>> I've also tried to clarify and wikify the Data driven journalism
> >> entry,
> >>>>>>> but that'll take time.
> >>>>>>>
> >>>>>>> All contribution/feedback/edit war welcome!
> >>>>>>> nkb.
> >>>>>>> --
> >>>>>>> Datajournalist since 2007
> >>>>>>> nkb.fr
> >>>>>>> +336 50 57 53 80
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, Oct 6, 2011 at 12:12 PM, Nicolas Kayser-Bril
> >>>>>>> <n.kayserbril at gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Mirko, Tom,
> >>>>>>>> Many thanks for your feedback! It concurs to the idea of
> >> datajournalism
> >>>>>>>> as a byword for 'new stuff in information management', starting
> from
> >> data
> >>>>>>>> collection and how to envision a story to interactive apps.
> >>>>>>>> I'll include your points in the Wikipedia entry!
> >>>>>>>> Best
> >>>>>>>> nkb.
> >>>>>>>> --
> >>>>>>>> Datajournalist since 2007
> >>>>>>>> nkb.fr
> >>>>>>>> +336 50 57 53 80
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Oct 6, 2011 at 11:29 AM, Tom Kronenburg
> >>>>>>>> <tom.kronenburg at zenc.nl> wrote:
> >>>>>>>>>
> >>>>>>>>> Dear Nicholas,
> >>>>>>>>> I will publish a report on open data and datajournalism on the
> >>>>>>>>> ePSIplatform.eu
> >>>>>>>>> In it, I recognize 4 types of activities that i consider
> >>>>>>>>> datajournalism. (naturally, with any definition you draw lines
> that
> >> are a
> >>>>>>>>> bit arbitrary)
> >>>>>>>>>
> >>>>>>>>> " There are four basic types of data journalistic activities. All
> >> four
> >>>>>>>>> types can use PSI, and we will provide examples of how
> journalists
> >> used Open
> >>>>>>>>> Data to write their stories. Data journalists use (open) data
> >>>>>>>>>
> >>>>>>>>> To discover newsworthy facts or stories [from data]
> >>>>>>>>>
> >>>>>>>>> To discover trends hidden in [large] datasets
> >>>>>>>>>
> >>>>>>>>> To compile datasets for further dissemination to the public.
> >>>>>>>>>
> >>>>>>>>> To create data visualisations."
> >>>>>>>>> 1: is what you might consider CAR (even though i understand that
> >> CAR
> >>>>>>>>> is as much an umbrella-word as data-journalism).
> >>>>>>>>> 2: is different from 1, because the timing is different. I'd say
> >> the
> >>>>>>>>> first category is about a single event, while 2 is about trends.
> >>>>>>>>> 3: is what you call "Database Journalism" or structured
> journalism.
> >>>>>>>>> 4: I have swept together all visualization/interaction stuff in
> one
> >>>>>>>>> category: "Infographics, dataviz, interactive viz (for me the
> same
> >> as
> >>>>>>>>> dataviz, although with different tools) - same goes for serious
> >> games".
> >>>>>>>>> So, basically, i think we agree on the main points that are in
> >> there.
> >>>>>>>>> I don't really know whether or not distinguishing categories 1 and 2
> >>>>>>>>> is important, but for me it feels like they are very different
> >> types of
> >>>>>>>>> activities. The first is 'searching' through datasets, combining
> >> single
> >>>>>>>>> lines, whereas trend discovery is much more about statistics,
> >> massive
> >>>>>>>>> computation and such.
> >>>>>>>>> When the report is published, i'll let you know.
> >>>>>>>>> Kind regards, Tom
> >>>>>>>>>
> >>>>>>>>> Tom Kronenburg
> >>>>>>>>>
> >>>>>>>>> Zenc | Focus op oplossingen
> >>>>>>>>> Alexanderstraat 18
> >>>>>>>>> 2514 JM Den Haag
> >>>>>>>>> KvK:  27190312
> >>>>>>>>> Tel:  +31 70 3626944 of +31 6 55778353
> >>>>>>>>> Fax:  +31 70 3921835
> >>>>>>>>> tom.kronenburg at zenc.nl
> >>>>>>>>> www.zenc.nl
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Op 6 okt 2011, om 10:30 heeft Nicolas Kayser-Bril het volgende
> >>>>>>>>> geschreven:
> >>>>>>>>>
> >>>>>>>>> Datajournalism has been widely used to unite several concepts and
> >> link
> >>>>>>>>> them to journalism. Among these are:
> >>>>>>>>>
> >>>>>>>>> Computer assisted reporting and data-driven journalism, where
> >>>>>>>>> journalists make use of large databases to produce stories,
> >>>>>>>>> Infographics,
> >>>>>>>>> Data visualization,
> >>>>>>>>> Interactive visualization,
> >>>>>>>>> Serious games, in the sense that they take interaction a step
> >> further,
> >>>>>>>>> and
> >>>>>>>>> Database journalism or structured journalism, an information
> >>>>>>>>> management system where pieces of information are organized in a
> >> database
> >>>>>>>>> (as opposed to a traditional story-centric organizational
> >> structure).
> >>>>>>>>>
> >>>>>>>>> I also plan to rework several entries, notably:
> >>>>>>>>>
> >>>>>>>>> _______________________________________________
> >>>>>>>>> data-driven-journalism mailing list
> >>>>>>>>> data-driven-journalism at lists.okfn.org
> >>>>>>>>> http://lists.okfn.org/mailman/listinfo/data-driven-journalism
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> @mlaucelli
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> @mlaucelli
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >> --
> >>
> >> http://www.datenjournalist.de
> >> http://www.opendatacity.de
> >>
> >> Twitter: http://twitter.com/lorz
> >> LinkedIn: http://www.linkedin.com/pub/lorenz-matzat/31/b00/571
> >> Facebook: http://de-de.facebook.com/people/Lo-Rz/100001536331743
> >>
> >> public PGP:
> >> http://gpg-keyserver.de/pks/lookup?op=get&search=0x53601B9EB93E01BE
> >>
> >> Lorenz Matzat
> >> Medienkombinat Berlin
> >> Köpenickerstraße 187/188
> >> 10997 Berlin
> >>
> >> Tel. (030) 7891 3457
> >> matzat at gmail.com
> >>
> >>
> >
> >
> >
>
>



-- 
Antti "Jogi" Poikola - +358 44 337 5439
--------------------------------------------
Q: Why is this email three sentences or less?
A: http://three.sentenc.es