-
Issue description is saved as varchar instead of text blob in database in order to be indexed and queried. Many databases have length limit on total varchars in a single table record.
-
I have inspected OneDev database (postgres):
o_issue.o_descriptionhas a length limit of14000and does not have any index.o_issuecomment.o_contenthas a length limit of15000and does not have any index.
Because there are no indexes I don't see a reason to use varchar for issue description and comments content. I guess you are using JPA and simple
LIKEqueries, right? I think a btree index would not help anyways in that case. For real full text search database specific features must usually be used (e.g. GIN/GIST indexes and tsvector/tsquery for postgres). At work I use JPA as well and all string columns are postgres text columns. Search queries work reasonably well if you can also add additional filters like project id or customer id. -
Sorry I forgot that. Using text column has its issues. Need to check further. Also OneDev needs to support different databases, and can not leverage specific feature of postgresql.
-
@robin You need to bare in mind that you are the only developer on the project currently, and although others have made contributions, maintaining a large number of databases is very time consuming. I believe you support both postgresql and mysql, they both serve similar functions for example, for a large enterprise solution they have a lot of devs to maintain each database connector, there is just you.
Might be worth setting up your priorities for database support, it will make maintaining the project easier :)
-
Right now, supporting multiple databases does not cost too much, thanks to hibernate. In future, if this does hold OneDev back, I may consider reducing database type support.
-
Checked it again, and the description column is using type
varcharinstead oftext/clobmostly because of performance on various database supported by OneDev, including MySQL/Oracle/SQLServer. -
Hmm and what is the performance bottleneck? What will be slow? Searching within description via JPA query?
Do you know
hibernate-search? It allows you to annotate entity fields and it will manage full text search indexes for you. As a backend you can either useelasticsearch/opensearchorlucene. Since you already useluceneit might be worth a look. Maybe it can help making the description columntext/clobor at least maxvarcharlength.https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#preface
-
Both for search and load. Hibernate search is not flexible enough, and OneDev actually indexes description itself with Lucene for fuzzy search (search text freely and return matched issues). The search we are discussing here is for grammar aware search ("Description" contains ...).
I don't think it is wise to sacrificing performance for some edge cases of long issue description.
-
@jbauer The maximum size of a VARCHAR is 255 characters, this is more than enough for a brief description of the code, and it is a lot more efficient than using a TEXT datatype.
Also why would you need a longer description, project description is meant to be brief, and then the information about the project is added to
README.mdThus yes, the description does need a length limit!
-
It is 14000 chars in OneDev.
-
@polarian
Wait what? I thought VARCHAR maxed out at 255?
The maximum size of a varchar column is highly database specific. For example MySQL as a limit of 65,535 bytes while PostgreSQL allows 10,485,760 bytes.
Also why you complaining about a 14,000 character limit, that is like an entire essay worth of characters!
I am talking about the description and not the title of an issue.
I am using Java and have a tool that automatically reports exceptions (along with some metadata) occurred in production to my old issue tracker. That metadata could for example be a user click stream that causes the exception. Because Java exceptions contain full qualified Java class names as well the file name (including path) a single stack frame (= one line) can easily be 100-150 characters long. User click streams also contain some meta data about where the user has clicked and these lines can also be longer. So at the end
14000 chars / <average line length>does result in roughly 100 lines. From these 100 lines you can subtract 40 lines of user click trace lines. 60 lines of stack frames can easily occur. So 14000 chars might sound a lot but for error reporting it can really be limiting.The plan is to rewrite that tool to directly push error reports to OneDev and get rid of the old issue tracker but the size limitation is a real blocker to do so. I might be able to split the error report into a title + description and then additional comments if description would be too long but it feels brittle to store the size limits of OneDev in a second tool. I then have to keep track of OneDev changes in case that limit changes. Ideally I want to avoid that.
Currently I manually migrate automatically created issue reports from the old system into OneDev and occasionally I have to shorten the reported data because of the above. If someone verified the migrated issue that person would also like to add a short section "steps to reproduce" to the description which also requires characters. Sure it could be a comment but it is easier to have everything at the top of the issue in one place when someone finally starts working on that issue.
-
Will do some benchmark to see if using long TEXT type is acceptable.
-
I am using Java and have a tool that automatically reports exceptions (along with some metadata) occurred in production to my old issue tracker.
Logs and Reports, especially long one, should be file attach to the issue and not in the RAW description. Better visibility for both developer and support, and saving performances.
Enlarge column size to TEXT may impact performances, just for this regular case.
-
Logs and Reports, especially long one, should be file attach to the issue and not in the RAW description. Better visibility for both developer and support, and saving performances.
I don't agree because text attachments are not searchable.
Generally I am fine with a limit but it must be unreasonably large for daily usage to not block you from using the tool. I see such limits for user input more like a last defense against attackers who want to fill the DB with huge amounts of data and not so much as a performance optimization because sooner or later you have to deal with performance anyway. It does not matter if you have 100,000 rows with each having a fully filled 10,000 character column or 50,000 rows each having a fully filled 20,000 character column. Sure you might get additional time until your critical size is reached (row number * column size) when limiting the column size but you will reach that point and you have to do proper performance optimization (table partitioning, full text search index and specialized SQL, etc..).
-
Generally I am fine with a limit but it must be unreasonably large for daily usage to not block you from using the tool. I see such limits for user input more like a last defense against attackers who want to fill the DB with huge amounts of data and not so much as a performance optimization because sooner or later you have to deal with performance anyway.
This is more the way you are using the issue system tracking than defense againts attacker.
In my opinion, the description field add some more precise infos about the issue : some lines of text describing the issue, screenshot or a relative amount of copy/paste data.
If you add logs or reports more than 100 lines, better to attach them to a file in the issue. Gain better visibily in the discussion, and hey, you're developper :
- open issue depending name, priority, state.
- reading description if title not clear.
- download the attach file to find the exception in the 10.000 lines to java stracktrace.
Browser is not for opening logs with thousands of lines at all.
Without proper organization on the way the support or automatic tools created issue, it's then the mess for developer to sort, find, understrand and fix.
-
Name Previous Value Current Value Type
Support Request
Improvement
-
Previous Value Current Value Does description really need a length limit?
Increase various description/comment length to sufficiently large
-
OneDev
changed state to 'Closed' 3 years ago
Previous Value Current Value Open
Closed
-
Now the length is increased to 100000 chars, which should be sufficient. Populated 100k issues and querying against description column can return result within 2 seconds for all supported databases, which should be acceptable.
In most cases, users are expected to perform fuzzy search against title/description/comment, which leverages lucene to do the job, and can return result within 10ms.
-
OneDev
changed state to 'Released' 3 years ago
Previous Value Current Value Closed
Released
-
State changed as build #3440 is successful
| Type |
Improvement
|
| Priority |
Normal
|
| Assignee |
Is there a reason why there is a length limit for the issue description? Can it be removed, increased or made configurable?
When an issue represents a bug and the description should contain the corresponding java stack trace, OneDev often complains that the description is too long because the stack trace is long. This is relatively annoying during migrating selected issues from a different software to OneDev because I then have to manually edit the stack trace and shorten it here and there hoping nothing relevant has been removed. Also when I migrating an internal tool that automatically reports issues including exceptions via API, this would be annoying as well in case the API call would fail.
I can see there might be some DoS potential when fully removing the limit but IMHO it should be set to something unreasonably large to never hit that limit in daily usage.