
Dear Wei Kang, Shubham Goyal, Anthony Tung,

We are sorry to inform you that your paper

55 Hierarchical Summarization of Tweets Based on DBpedia Ontology

has not been accepted for the ACM SIGIR 2015 Conference.  As usual, competition
was very strong -- of the 351 papers submitted, only 70 (20%) were accepted.

Three or more referees reviewed your paper. A member of the Program Committee
then led a discussion, following which they wrote a summary meta-review.
A second member of the Program Committee double-checked the reviews and the
discussions, and in some cases, provided additional input.  You should be
able to identify five distinct reviews in the information that is provided
below, matching these three different roles.

In some cases further summary reviews were added as a result of the discussion
that took place at the PC Meeting, held on 10 and 11 April.

The reviews and meta-reviews for your paper are included in the remainder of
this email.  Please read them carefully; they provide feedback and suggestions
that should allow you to improve your paper.  Based on the reviews, you
may wish to revise your paper to address the concerns raised by the SIGIR
referees, and consider submission to another venue.  Depending on the topic
of your paper and the level of support it received via the reviewer scores,
you might consider submission to CIKM (http://www.cikm-2015.org/) or ICTIR
(http://ictir2015.org/) both of which have submission deadlines in early May.
SIGIR 2015 workshops (http://www.sigir2015.org/callforpapers/workshops)
and SIGIR SIRIP (http://www.sigir2015.org/callforpapers/industrialtrack)
provide other alternatives.

Note that in addition to the research papers and demos, there is
also an excellent program of workshops and tutorials planned, which offer
further opportunities for you to participate in SIGIR 2015.

Thank you for your submission to SIGIR.

Best regards,


-- Alistair, Berthier, Mounia

------------- Review from Reviewer 1 -------------
Relevance to SIGIR (1-5, accept threshold=3)  : 3
Originality of Work (1-5, accept threshold=3) : 3
Technical Soundness (1-5, accept threshold=3) : 3
Quality of Presentation (1-5, accept threshold=3)  : 3
Impact of Ideas or Results (1-5, accept threshold=3) : 2
Adequacy of Citations (1-5, accept threshold=3) : 3
Reproducibility of Methods (1-5, accept threshold=3) : 3
Overall Recommendation (1-6)                  : 3

-- Comments to the author(s):
This is the meta-review by the Primary PCM responsible for your paper, and takes into account the opinions expressed by the referees, the subsequent decision thread, and my own opinions about your work.


Reviewers found several problems in this paper, as noted in their reviews. An important concern shared by all three reviewers is the lack of an explicit evaluation of the quality of the summaries. A comparison with similar systems, not only in terms of scalability/efficiency, but in terms of effectiveness/quality, is necessary.


My recommendation is to reject the paper.
-- Summary:
Primary PCM recommends rejecting the paper.
---------- End of Review from Reviewer 1 ----------


------------- Review from Reviewer 2 -------------
Relevance to SIGIR (1-5, accept threshold=3)  : 3
Originality of Work (1-5, accept threshold=3) : 3
Technical Soundness (1-5, accept threshold=3) : 3
Quality of Presentation (1-5, accept threshold=3)  : 3
Impact of Ideas or Results (1-5, accept threshold=3) : 2
Adequacy of Citations (1-5, accept threshold=3) : 3
Reproducibility of Methods (1-5, accept threshold=3) : 3
Overall Recommendation (1-6)                  : 3

-- Comments to the author(s):
As Secondary PCM I have reviewed the reviews and the subsequent discussion, and I concur with the recommended decision.


-- Summary:
As Secondary PCM I have reviewed the reviews and the subsequent discussion, and I concur with the recommended decision.


---------- End of Review from Reviewer 2 ----------


------------- Review from Reviewer 3 -------------
Relevance to SIGIR (1-5, accept threshold=3)  : 3
Originality of Work (1-5, accept threshold=3) : 3
Technical Soundness (1-5, accept threshold=3) : 4
Quality of Presentation (1-5, accept threshold=3)  : 2
Impact of Ideas or Results (1-5, accept threshold=3) : 3
Adequacy of Citations (1-5, accept threshold=3) : 4
Reproducibility of Methods (1-5, accept threshold=3) : 3
Overall Recommendation (1-6)                  : 3

-- Comments to the author(s):
This is an interesting paper that tries to expose the relationships between entities extracted from tweets about a single topic. In its favor, the paper describes in detail an approach to link entities to class chains from the Wikipedia ontology and proposes to use this relation to build entity trees that provide an overview of the topic. However, the paper has some issues in its current form. Most notably, it is not clear what the authors are trying to evaluate in the experiments. Since the authors do not compare to a baseline for the chain generation task, we cannot determine whether this improves over existing approaches. Meanwhile, in the summarization comparison against Vesta, only efficiency rather than effectiveness is tested.


I recommend that the authors try to identify the salient points that distinguish their approach from the literature and how they can show that this approach is objectively better, more effective, or more efficient. For instance, if efficiency is the main selling point, then this should be discussed in more detail in the early sections. Meanwhile, if they are to claim that it is effective, then they need to compare it to similar systems or, if this is not possible, evaluate it through a user study.


Other comments:

 - It is not clear from the introduction whether producing these entity overviews is actually useful. Is there literature supporting this for summarization tasks?

 - When considering Twitter, timeliness is a key constraint, but this is not discussed. Is the approach appropriate for real-time stream processing, and do you think that Wikipedia will be updated fast enough to match new entities as they emerge?

 - Section 4.1 seems to be a complication that could be removed. If you just use English examples later on (Person -> Writer, Artist, ... rather than C112, etc.), the paper would be easier to read.

 - Figure 6 would be easier to read as a table.

 - To make sense of the Vertical metrics we need to know what the class sizes are.

 - 6.2.4 seems out of place. Either you can state in the related work that these approaches are not suitable for use on Twitter or compare to Twitter-specific approaches.
-- Summary:
This is a paper that has some promise as an alternative way to provide an overview of an event. But the current experiments lack baselines to compare to and are incomplete.


Positive:

 - Detailed description of the proposed approach, stage by stage

 - Uses existing resources (Wikipedia ontologies) for a different domain (Twitter summarization)

 - Generally well written


Negative:

 - Lacks the needed experimentation to show that their approach is effective

 - Lacks stated research questions

 - The approach could be better motivated
---------- End of Review from Reviewer 3 ----------


------------- Review from Reviewer 4 -------------
Relevance to SIGIR (1-5, accept threshold=3)  : 4
Originality of Work (1-5, accept threshold=3) : 3
Technical Soundness (1-5, accept threshold=3) : 2
Quality of Presentation (1-5, accept threshold=3)  : 4
Impact of Ideas or Results (1-5, accept threshold=3) : 2
Adequacy of Citations (1-5, accept threshold=3) : 2
Reproducibility of Methods (1-5, accept threshold=3) : 3
Overall Recommendation (1-6)                  : 3

-- Comments to the author(s):
This paper proposes to classify concepts extracted from Twitter into pre-defined classes of the DBpedia ontology. Such a tree-like presentation is claimed to be a hierarchical summarization of a set of tweets.


The challenge of the work is to map Wiki concepts extracted from Twitter into DBpedia classes. Using infobox information as features, the authors use naive Bayes classifiers to predict a class chain for each concept along the DBpedia ontology. The paper is well written and the model is easy to follow.
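
For illustration only, the kind of top-down class-chain prediction described above might be sketched as follows. This is not the authors' mNBC; it is a minimal naive Bayes with add-one smoothing over infobox attribute names, and the toy ontology, attribute names, and training examples are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy ontology: parent class -> child classes.
ONTOLOGY = {
    "Thing": ["Person", "Place"],
    "Person": ["Writer", "Athlete"],
}

# Hypothetical training data: (infobox attribute names, class chain).
TRAIN = [
    ({"birthDate", "genre", "notableWork"}, ["Thing", "Person", "Writer"]),
    ({"birthDate", "notableWork"}, ["Thing", "Person", "Writer"]),
    ({"birthDate", "team", "position"}, ["Thing", "Person", "Athlete"]),
    ({"birthDate", "team"}, ["Thing", "Person", "Athlete"]),
    ({"population", "country"}, ["Thing", "Place"]),
    ({"country", "areaTotal"}, ["Thing", "Place"]),
]


def train(examples):
    """Collect, per parent class, child counts and per-child attribute counts."""
    child_counts = defaultdict(Counter)
    attr_counts = defaultdict(Counter)
    vocab = set()
    for attrs, chain in examples:
        vocab |= attrs
        for parent, child in zip(chain, chain[1:]):
            child_counts[parent][child] += 1
            attr_counts[(parent, child)].update(attrs)
    return child_counts, attr_counts, vocab


def predict_chain(attrs, model, root="Thing"):
    """Descend from the root, picking the most probable child at each level."""
    child_counts, attr_counts, vocab = model
    chain = [root]
    while chain[-1] in ONTOLOGY:
        parent = chain[-1]
        total = sum(child_counts[parent].values())
        best, best_score = None, -math.inf
        for child in ONTOLOGY[parent]:
            n = child_counts[parent][child]
            if n == 0:
                continue  # child never seen in training
            counts = attr_counts[(parent, child)]
            denom = sum(counts.values()) + len(vocab)
            # log prior + sum of add-one smoothed log likelihoods
            score = math.log(n / total)
            score += sum(math.log((counts[a] + 1) / denom) for a in attrs)
            if score > best_score:
                best, best_score = child, score
        if best is None:
            break
        chain.append(best)
    return chain


MODEL = train(TRAIN)
print(predict_chain({"birthDate", "team"}, MODEL))
print(predict_chain({"country", "population"}, MODEL))
```

Training one small classifier per parent class keeps each decision local to a single level of the hierarchy, which is one plausible reading of "a class chain along the ontology".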


The main concern is the representation. In the paper, the authors try to convince readers about the novelty and effectiveness of this hierarchical representation.


Such entity-to-ontology mapping is itself an existing research problem in DBpedia (see "DBpedia: A Multilingual Cross-Domain Knowledge Base"). The DBpedia team also utilizes infoboxes for a similar task. The difference, from my point of view, is that the authors of this paper focus on a context-dependent aspect and care only about the concepts extracted from the given tweets. But even in this case, the authors need to cite quite a few papers from the information extraction community that work on ontology derivation and entity-ontology mapping. The papers mentioned in the related work are all about summarization.


Meanwhile, from DBpedia 3.7, the ontology changes from a tree structure to a DAG (the first class of each entity is the most notable one). In this paper, although the authors are using ver. 3.9, it seems they prefer to derive a tree instead of a DAG for summarization. It would be interesting to see whether the framework can be extended to a DAG.


Even if we agree on the representation, there are no quantitative experiments about its performance at all. Only a case study is provided at the end of this work. At least some human evaluation is needed.


The most interesting part is Sec. 6.2.4. I don't think naive Bayes classifiers are that much more efficient than all the other tree-based classification models. Sparseness is easily handled by the classification methods in any mature package. At least some baselines, even non-hierarchical classifiers, should be compared with the proposed method.


-- Summary:
The tree-based summarization representation needs to be carefully evaluated. Some related work in information extraction is missing. The mNBC should be evaluated thoroughly.
---------- End of Review from Reviewer 4 ----------


------------- Review from Reviewer 5 -------------
Relevance to SIGIR (1-5, accept threshold=3)  : 4
Originality of Work (1-5, accept threshold=3) : 3
Technical Soundness (1-5, accept threshold=3) : 2
Quality of Presentation (1-5, accept threshold=3)  : 2
Impact of Ideas or Results (1-5, accept threshold=3) : 2
Adequacy of Citations (1-5, accept threshold=3) : 3
Reproducibility of Methods (1-5, accept threshold=3) : 2
Overall Recommendation (1-6)                  : 2

-- Comments to the author(s):
This paper presents a method to organize tweets by linking them to the DBpedia ontology. The core idea of the paper is to find entities mentioned in tweets and link them to their DBpedia classes.


The main idea behind the paper has some merit but there are significant issues with the framing of the problem, the evaluation and overall presentation.


The objectives of the three tasks described in Section 5 and their motivations are confusing. For example, Section 5.2 describes summary generation as the task of "cutting off sub-hierarchies" from the ontology given a budget of top delta entities to include in a summary. There is no explanation given for why the budget for the summary is in terms of the number of entities and not the number of tweets that can be shown. This task can be viewed as a summarization task, but what constitutes a good summary in this sense is unclear.
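
To make the entity-budget formulation concrete, here is one possible reading of it, sketched under assumptions that are not taken from the paper: every entity carries a score (for example a mention count), the delta highest-scoring entities are kept, and any sub-hierarchy left without a retained entity is cut off. The class tree, entity names, and scores are hypothetical.

```python
# Hypothetical class hierarchy: parent class -> child classes.
TREE = {"Thing": ["Person", "Place"], "Person": ["Writer", "Athlete"]}

# Hypothetical entities per class with scores (e.g. mention counts).
ENTITIES = {
    "Writer": [("J. K. Rowling", 12)],
    "Athlete": [("Lionel Messi", 40), ("Serena Williams", 25)],
    "Place": [("Singapore", 3)],
}


def summarize(tree, entities, delta, root="Thing"):
    """Keep the delta highest-scoring entities; cut every sub-hierarchy left empty."""
    ranked = sorted(
        ((score, name, cls) for cls, items in entities.items() for name, score in items),
        key=lambda t: (-t[0], t[1]),  # by score descending, then name
    )
    kept = {}
    for _score, name, cls in ranked[:delta]:
        kept.setdefault(cls, []).append(name)

    def build(cls):
        children = {}
        for child in tree.get(cls, []):
            sub = build(child)
            if sub is not None:
                children[child] = sub
        names = kept.get(cls, [])
        if not children and not names:
            return None  # nothing retained at or below this class: cut it off
        return {"entities": names, "children": children}

    return build(root)


SUMMARY = summarize(TREE, ENTITIES, delta=2)
print(SUMMARY)
```

Under this reading the budget bounds the number of entities shown, not the number of tweets, which is exactly the design choice questioned above.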


Similarly, the entity selection task seems to be motivated by the fact that many entities map to some classes. While this may be true, what makes an entity important within a class, and how this relates to the coherence of the classes involved, is not clear.


The evaluation consists entirely of different variations of the proposed models. The quality of the summaries is not evaluated explicitly, except for the comparison with respect to classification accuracy. The external comparison to Vesta is in terms of scalability but not with respect to quality.


-- Summary:
Overall, many parts of the paper are hard to follow. The paper is pursuing a nice idea but needs significant improvements in many areas.
---------- End of Review from Reviewer 5 ----------